From patchwork Tue Apr 27 20:42:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227225 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC7F0C433B4 for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6D51861407 for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D51861407 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 473936B0070; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 422B86B0036; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 09D826B0073; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0230.hostedemail.com [216.40.44.230]) by kanga.kvack.org (Postfix) with ESMTP id CF97D6B0036 for ; Tue, 27 Apr 2021 16:44:11 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 90BEB180AD81A for ; Tue, 27 Apr 2021 20:44:11 +0000 (UTC) X-FDA: 78079324302.05.9CC1F7F Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf10.hostedemail.com (Postfix) with ESMTP id B138240002C8 for ; Tue, 27 Apr 2021 20:43:58 +0000 (UTC) IronPort-SDR: bbKgb0DU4r67/xCWrsmy8ZfN0kMKEHJ/U/Dc1+V5PTdFQCJtIAPxSWmXbU/MafrvrVBKLKkCna GyfoP16uHcYg== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473631" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473631" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:06 -0700 IronPort-SDR: ukNBc4WAHGwx0qA0I5gYHT70eiUWn7SEK3ryavhqrCAf9AkZnixBD3luxMZ2NfpP1oETrbSMCm U0ItHrWtokww== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623425" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:06 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 01/30] Documentation/x86: Add CET description Date: Tue, 27 Apr 2021 13:42:46 -0700 Message-Id: <20210427204315.24153-2-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: B138240002C8 X-Stat-Signature: s9jk964igjijsotdto6hphnmie1gj9nw Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556238-414641 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Explain no_user_shstk/no_user_ibt kernel parameters, and introduce a new document on Control-flow Enforcement Technology (CET). Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v24: - Update for Kconfig changes from X86_CET to X86_SHADOW_STACK, X86_IBT. - Update for the change of VM_SHSTK to VM_SHADOW_STACK. .../admin-guide/kernel-parameters.txt | 6 + Documentation/x86/index.rst | 1 + Documentation/x86/intel_cet.rst | 136 ++++++++++++++++++ 3 files changed, 143 insertions(+) create mode 100644 Documentation/x86/intel_cet.rst diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 04545725f187..bc79e54be91e 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3220,6 +3220,12 @@ noexec=on: enable non-executable mappings (default) noexec=off: disable non-executable mappings + no_user_shstk [X86-64] Disable Shadow Stack for user-mode + applications + + no_user_ibt [X86-64] Disable Indirect Branch Tracking for user-mode + applications + nosmap [X86,PPC] Disable SMAP (Supervisor Mode Access Prevention) even if it is supported by processor. diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index 4693e192b447..cf5250a3cc70 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -21,6 +21,7 @@ x86-specific Documentation tlb mtrr pat + intel_cet intel-iommu intel_txt amd-memory-encryption diff --git a/Documentation/x86/intel_cet.rst b/Documentation/x86/intel_cet.rst new file mode 100644 index 000000000000..ae30c392994a --- /dev/null +++ b/Documentation/x86/intel_cet.rst @@ -0,0 +1,136 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +Control-flow Enforcement Technology (CET) +========================================= + +[1] Overview +============ + +Control-flow Enforcement Technology (CET) is an Intel processor feature +that provides protection against return/jump-oriented programming (ROP) +attacks. It can be set up to protect both applications and the kernel. +Only user-mode protection is implemented in the 64-bit kernel, including +support for running legacy 32-bit applications. + +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is +a secondary stack allocated from memory and cannot be directly modified by +applications. When executing a CALL instruction, the processor pushes the +return address to both the normal stack and the shadow stack. Upon +function return, the processor pops the shadow stack copy and compares it +to the normal stack copy. If the two differ, the processor raises a +control-protection fault. Indirect branch tracking verifies indirect +CALL/JMP targets are intended as marked by the compiler with 'ENDBR' +opcodes. + +There are two Kconfig options: + + X86_SHADOW_STACK, and X86_IBT. + +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1 +or later are required. To build a CET-enabled application, GLIBC v2.28 or +later is also required. + +There are two command-line options for disabling CET features:: + + no_user_shstk - disables user shadow stack, and + no_user_ibt - disables user indirect branch tracking. + +At run time, /proc/cpuinfo shows CET features if the processor supports +CET. + +[2] Application Enabling +======================== + +An application's CET capability is marked in its ELF header and can be +verified from readelf/llvm-readelf output: + + readelf -n | grep -a SHSTK + properties: x86 feature: IBT, SHSTK + +If an application supports CET and is statically linked, it will run with +CET protection. If the application needs any shared libraries, the loader +checks all dependencies and enables CET when all requirements are met. + +[3] Backward Compatibility +========================== + +GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment +variable: + +GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT + Turn off SHSTK/IBT. + +GLIBC_TUNABLES=glibc.tune.x86_shstk= + This controls how dlopen() handles SHSTK legacy libraries:: + + on - continue with SHSTK enabled; + permissive - continue with SHSTK off. + +Details can be found in the GLIBC manual pages. + +[4] CET arch_prctl()'s +====================== + +Several arch_prctl()'s have been added for CET: + +arch_prctl(ARCH_X86_CET_STATUS, u64 *addr) + Return CET feature status. + + The parameter 'addr' is a pointer to a user buffer. + On returning to the caller, the kernel fills the following + information:: + + *addr = shadow stack/indirect branch tracking status + *(addr + 1) = shadow stack base address + *(addr + 2) = shadow stack size + +arch_prctl(ARCH_X86_CET_DISABLE, unsigned int features) + Disable shadow stack and/or indirect branch tracking as specified in + 'features'. Return -EPERM if CET is locked. + +arch_prctl(ARCH_X86_CET_LOCK) + Lock in all CET features. They cannot be turned off afterwards. + +Note: + There is no CET-enabling arch_prctl function. By design, CET is enabled + automatically if the binary and the system can support it. + +[5] The implementation of the Shadow Stack +========================================== + +Shadow Stack size +----------------- + +A task's shadow stack is allocated from memory to a fixed size of +MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to +the maximum size of the normal stack, but capped to 4 GB. However, +a compat-mode application's address space is smaller, each of its thread's +shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB). + +Signal +------ + +The main program and its signal handlers use the same shadow stack. +Because the shadow stack stores only return addresses, a large shadow +stack covers the condition that both the program stack and the signal +alternate stack run out. + +The kernel creates a restore token for the shadow stack restoring address +and verifies that token when restoring from the signal handler. + +Fork +---- + +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a +shadow access triggers a page fault with the shadow stack access bit set +in the page fault error code. + +When a task forks a child, its shadow stack PTEs are copied and both the +parent's and the child's shadow stack PTEs are cleared of the dirty bit. +Upon the next shadow stack access, the resulting shadow stack page fault +is handled by page copy/re-use. + +When a pthread child is created, the kernel allocates a new shadow stack +for the new thread. From patchwork Tue Apr 27 20:42:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A867C43460 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2489F613B0 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2489F613B0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7159A6B0036; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 620846B0073; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 383586B0072; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0037.hostedemail.com [216.40.44.37]) by kanga.kvack.org (Postfix) with ESMTP id E57DD6B0071 for ; Tue, 27 Apr 2021 16:44:11 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 9654A1867 for ; Tue, 27 Apr 2021 20:44:11 +0000 (UTC) X-FDA: 78079324302.28.531C490 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 731332000261 for ; Tue, 27 Apr 2021 20:44:11 +0000 (UTC) IronPort-SDR: /6vRjsB3F10VQ9qQACxt/k/kv9ZgT8a7v76OeeoztITgwyZ6fgVjX0agssxwLONjuVSuOq7Y9q 1BHpvYqtGkiw== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473637" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473637" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:07 -0700 IronPort-SDR: 7P8jTrq4zYxq3ucQoaAPWtoyBHLoKd1I1fXA8lTvxqFOvUSMbpYdEh7z1PHiAdeYJM27K219Id VJ/nB5nqJdKQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623429" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:06 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 02/30] x86/cet/shstk: Add Kconfig option for Shadow Stack Date: Tue, 27 Apr 2021 13:42:47 -0700 Message-Id: <20210427204315.24153-3-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 731332000261 X-Stat-Signature: 85qci8n7tong96k3sdm4ubj11knsofhc X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556251-535008 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Shadow Stack provides protection against function return address corruption. It is active when the processor supports it, the kernel has CONFIG_X86_SHADOW_STACK enabled, and the application is built for the feature. This is only implemented for the 64-bit kernel. When it is enabled, legacy non-Shadow Stack applications continue to work, but without protection. Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Remove X86_CET and use X86_SHADOW_STACK directly. v24: - Update for the splitting X86_CET to X86_SHADOW_STACK and X86_IBT. arch/x86/Kconfig | 22 ++++++++++++++++++++++ arch/x86/Kconfig.assembler | 5 +++++ 2 files changed, 27 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2792879d398e..41283f82fd87 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -28,6 +28,7 @@ config X86_64 select ARCH_HAS_GIGANTIC_PAGE select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 select ARCH_USE_CMPXCHG_LOCKREF + select ARCH_HAS_SHADOW_STACK select HAVE_ARCH_SOFT_DIRTY select MODULES_USE_ELF_RELA select NEED_DMA_MAP_STATE @@ -1941,6 +1942,27 @@ config X86_SGX If unsure, say N. +config ARCH_HAS_SHADOW_STACK + def_bool n + +config X86_SHADOW_STACK + prompt "Intel Shadow Stack" + def_bool n + depends on AS_WRUSS + depends on ARCH_HAS_SHADOW_STACK + select ARCH_USES_HIGH_VMA_FLAGS + help + Shadow Stack protection is a hardware feature that detects function + return address corruption. This helps mitigate ROP attacks. + Applications must be enabled to use it, and old userspace does not + get protection "for free". + Support for this feature is present on Tiger Lake family of + processors released in 2020 or later. Enabling this feature + increases kernel text size by 3.7 KB. + See Documentation/x86/intel_cet.rst for more information. + + If unsure, say N. + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/Kconfig.assembler b/arch/x86/Kconfig.assembler index 26b8c08e2fc4..00c79dd93651 100644 --- a/arch/x86/Kconfig.assembler +++ b/arch/x86/Kconfig.assembler @@ -19,3 +19,8 @@ config AS_TPAUSE def_bool $(as-instr,tpause %ecx) help Supported by binutils >= 2.31.1 and LLVM integrated assembler >= V7 + +config AS_WRUSS + def_bool $(as-instr,wrussq %rax$(comma)(%rbx)) + help + Supported by binutils >= 2.31 and LLVM integrated assembler From patchwork Tue Apr 27 20:42:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227231 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 440D7C43460 for ; Tue, 27 Apr 2021 20:44:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D87D76112F for ; Tue, 27 Apr 2021 20:44:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D87D76112F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 2F6796B0072; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 22B426B0071; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CFA6B6B0075; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0252.hostedemail.com [216.40.44.252]) by kanga.kvack.org (Postfix) with ESMTP id 6B5296B0071 for ; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 1DB32181AEF00 for ; Tue, 27 Apr 2021 20:44:12 +0000 (UTC) X-FDA: 78079324344.21.73C44EC Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf11.hostedemail.com (Postfix) with ESMTP id 6616A2000242 for ; Tue, 27 Apr 2021 20:43:56 +0000 (UTC) IronPort-SDR: bKwd8lnGqH1dlvSu3E0bO+kvo/y8FGdlCfkJatzhobSFQ7YgqOU43xlGf1/1/qdce5izOJNZn7 LABZ+52v7T8A== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473640" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473640" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:08 -0700 IronPort-SDR: Bcd+EzFtUu7eKQIVxc54Wc1U0Fn8Bqa4xygyz/f6h4gezmT1DKRP6bbgibVqaSXDjPGhB+z9jg 5r6ZcRHMSd/Q== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623432" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:07 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 03/30] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Date: Tue, 27 Apr 2021 13:42:48 -0700 Message-Id: <20210427204315.24153-4-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 6616A2000242 X-Stat-Signature: ou76q7atry15kb1wf1hnuw4wqy564znd Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf11; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556236-348543 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add CPU feature flags for Control-flow Enforcement Technology (CET). CPUID.(EAX=7,ECX=0):ECX[bit 7] Shadow stack CPUID.(EAX=7,ECX=0):EDX[bit 20] Indirect Branch Tracking Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Make X86_FEATURE_IBT depend on X86_FEATURE_SHSTK. v24: - Update for splitting CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK and CONFIG_X86_IBT. - Move DISABLE_IBT definition to the IBT series. arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/include/asm/disabled-features.h | 8 +++++++- arch/x86/kernel/cpu/cpuid-deps.c | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index cc96e26d69f7..bf861fc89fef 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -345,6 +345,7 @@ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */ #define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ #define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ #define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ @@ -377,6 +378,7 @@ #define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */ #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */ #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ #define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index b7dd944dc867..e5c6ed9373e8 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -68,6 +68,12 @@ # define DISABLE_SGX (1 << (X86_FEATURE_SGX & 31)) #endif +#ifdef CONFIG_X86_SHADOW_STACK +#define DISABLE_SHSTK 0 +#else +#define DISABLE_SHSTK (1 << (X86_FEATURE_SHSTK & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -88,7 +94,7 @@ #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \ - DISABLE_ENQCMD) + DISABLE_ENQCMD|DISABLE_SHSTK) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 #define DISABLED_MASK19 0 diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index 42af31b64c2c..c88b7e0ebf6d 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -72,6 +72,8 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW }, { X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES }, { X86_FEATURE_PER_THREAD_MBA, X86_FEATURE_MBA }, + { X86_FEATURE_SHSTK, X86_FEATURE_XSAVES }, + { X86_FEATURE_IBT, X86_FEATURE_SHSTK }, {} }; From patchwork Tue Apr 27 20:42:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E80DC433B4 for ; Tue, 27 Apr 2021 20:44:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DED0B61404 for ; Tue, 27 Apr 2021 20:44:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DED0B61404 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 02D398D0003; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E34276B0072; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B9C6B6B0074; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id 7F5466B0072 for ; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 39C07181AEF15 for ; Tue, 27 Apr 2021 20:44:12 +0000 (UTC) X-FDA: 78079324344.14.524D49B Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf10.hostedemail.com (Postfix) with ESMTP id 73F1A40002C8 for ; Tue, 27 Apr 2021 20:44:01 +0000 (UTC) IronPort-SDR: KPMvE+CUjbNxRqQyytsqr/M3OsZtImQY8KRLfbscPltwJS7PL4gT+GJWnqrQJ0bgJ+/5woH8e/ b1STR18hmpDQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473643" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473643" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:08 -0700 IronPort-SDR: +tXvO61IkH4zH7pJV2sVpNWnofhwenRDVDS/7i0ig/pVnhV9EyAHWTpko3HUu3GazYRoxKx7IE tHDu0dZ9obWw== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623436" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:08 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 04/30] x86/cpufeatures: Introduce CPU setup and option parsing for CET Date: Tue, 27 Apr 2021 13:42:49 -0700 Message-Id: <20210427204315.24153-5-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 73F1A40002C8 X-Stat-Signature: 5wewq388qxdtdmnri4zuwjb4pwoga6ns Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556241-380021 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Introduce CPU setup and boot option parsing for CET features. Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Remove software-defined X86_FEATURE_CET. v24: - Update #ifdef placement to reflect Kconfig changes of splitting shadow stack and ibt. arch/x86/include/uapi/asm/processor-flags.h | 2 ++ arch/x86/kernel/cpu/common.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+) diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index bcba3c643e63..a8df907e8017 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -130,6 +130,8 @@ #define X86_CR4_SMAP _BITUL(X86_CR4_SMAP_BIT) #define X86_CR4_PKE_BIT 22 /* enable Protection Keys support */ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) +#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement */ +#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) /* * x86-64 Task Priority Register, CR8 diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index ab640abe26b6..b6eeb5f2ae4d 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -510,6 +510,14 @@ static __init int setup_disable_pku(char *arg) __setup("nopku", setup_disable_pku); #endif /* CONFIG_X86_64 */ +static __always_inline void setup_cet(struct cpuinfo_x86 *c) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return; + + cr4_set_bits(X86_CR4_CET); +} + /* * Some CPU features depend on higher CPUID levels, which may not always * be available due to CPUID level capping or broken virtualization @@ -1255,6 +1263,11 @@ static void __init cpu_parse_early_param(void) if (cmdline_find_option_bool(boot_command_line, "noxsaves")) setup_clear_cpu_cap(X86_FEATURE_XSAVES); + if (cmdline_find_option_bool(boot_command_line, "no_user_shstk")) + setup_clear_cpu_cap(X86_FEATURE_SHSTK); + if (cmdline_find_option_bool(boot_command_line, "no_user_ibt")) + setup_clear_cpu_cap(X86_FEATURE_IBT); + arglen = cmdline_find_option(boot_command_line, "clearcpuid", arg, sizeof(arg)); if (arglen <= 0) return; @@ -1594,6 +1607,7 @@ static void identify_cpu(struct cpuinfo_x86 *c) x86_init_rdrand(c); setup_pku(c); + setup_cet(c); /* * Clear/Set all flags overridden by options, need do it From patchwork Tue Apr 27 20:42:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227233 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4156DC433ED for ; Tue, 27 Apr 2021 20:44:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D199D6140C for ; Tue, 27 Apr 2021 20:44:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D199D6140C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 61F746B0071; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 55B4C6B0078; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 029A68D0002; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0165.hostedemail.com [216.40.44.165]) by kanga.kvack.org (Postfix) with ESMTP id C52D36B0073 for ; Tue, 27 Apr 2021 16:44:12 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 8232F52AE for ; Tue, 27 Apr 2021 20:44:12 +0000 (UTC) X-FDA: 78079324344.05.01A7567 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 45BB22000250 for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) IronPort-SDR: pzu2U5Z0Uoun5utL8Vp8ivS6h64y+78ukE/FqcgQdhgy+Th1jsnYusz//usSaL7gFvdrdiFFws BS+ArAyAI9IA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473646" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473646" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:09 -0700 IronPort-SDR: zOsjio8iqqivkz+MCCXcaqMciYRCM1cHVSk2gstSRnJ8W/9x9fOlYp4Q0ykF78mqsnlCckfBAZ qhwpkqAQp9ZQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623440" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:08 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 05/30] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states Date: Tue, 27 Apr 2021 13:42:50 -0700 Message-Id: <20210427204315.24153-6-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 45BB22000250 X-Stat-Signature: jxn6dty6saxkdpx5fj5czqbmhjwzfsu4 X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556253-134977 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Control-flow Enforcement Technology (CET) introduces these MSRs: MSR_IA32_U_CET (user-mode CET settings), MSR_IA32_PL3_SSP (user-mode shadow stack pointer), MSR_IA32_PL0_SSP (kernel-mode shadow stack pointer), MSR_IA32_PL1_SSP (Privilege Level 1 shadow stack pointer), MSR_IA32_PL2_SSP (Privilege Level 2 shadow stack pointer), MSR_IA32_S_CET (kernel-mode CET settings), MSR_IA32_INT_SSP_TAB (exception shadow stack table). The two user-mode MSRs belong to XFEATURE_CET_USER. The first three of kernel-mode MSRs belong to XFEATURE_CET_KERNEL. Both XSAVES states are supervisor states. This means that there is no direct, unprivileged access to these states, making it harder for an attacker to subvert CET. For sigreturn and future ptrace() support, shadow stack address and MSR reserved bits are checked before written to the supervisor states. Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Update xsave_cpuid_features[]. Now CET XSAVES features depend on X86_FEATURE_SHSTK (vs. the software-defined X86_FEATURE_CET). arch/x86/include/asm/fpu/types.h | 23 +++++++++++++++++++++-- arch/x86/include/asm/fpu/xstate.h | 6 ++++-- arch/x86/include/asm/msr-index.h | 19 +++++++++++++++++++ arch/x86/kernel/fpu/xstate.c | 10 +++++++++- 4 files changed, 53 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index f5a38a5f3ae1..035eb0ec665e 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -115,8 +115,8 @@ enum xfeature { XFEATURE_PT_UNIMPLEMENTED_SO_FAR, XFEATURE_PKRU, XFEATURE_PASID, - XFEATURE_RSRVD_COMP_11, - XFEATURE_RSRVD_COMP_12, + XFEATURE_CET_USER, + XFEATURE_CET_KERNEL, XFEATURE_RSRVD_COMP_13, XFEATURE_RSRVD_COMP_14, XFEATURE_LBR, @@ -135,6 +135,8 @@ enum xfeature { #define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR) #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) #define XFEATURE_MASK_PASID (1 << XFEATURE_PASID) +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_FPSSE (XFEATURE_MASK_FP | XFEATURE_MASK_SSE) @@ -237,6 +239,23 @@ struct pkru_state { u32 pad; } __packed; +/* + * State component 11 is Control-flow Enforcement user states + */ +struct cet_user_state { + u64 user_cet; /* user control-flow settings */ + u64 user_ssp; /* user shadow stack pointer */ +}; + +/* + * State component 12 is Control-flow Enforcement kernel states + */ +struct cet_kernel_state { + u64 kernel_ssp; /* kernel shadow stack */ + u64 pl1_ssp; /* privilege level 1 shadow stack */ + u64 pl2_ssp; /* privilege level 2 shadow stack */ +}; + /* * State component 15: Architectural LBR configuration state. * The size of Arch LBR state depends on the number of LBRs (lbr_depth). diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 47a92232d595..582f3575e0bd 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -35,7 +35,8 @@ XFEATURE_MASK_BNDCSR) /* All currently supported supervisor features */ -#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID) +#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ + XFEATURE_MASK_CET_USER) /* * A supervisor state component may not always contain valuable information, @@ -62,7 +63,8 @@ * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. */ -#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT) +#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \ + XFEATURE_MASK_CET_KERNEL) /* All supervisor states including supported and unsupported states. */ #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 546d6ecf0a35..5f4b7edead0b 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -933,4 +933,23 @@ #define MSR_VM_IGNNE 0xc0010115 #define MSR_VM_HSAVE_PA 0xc0010117 +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x000006a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x000006a2 /* kernel mode cet setting */ +#define CET_SHSTK_EN BIT_ULL(0) +#define CET_WRSS_EN BIT_ULL(1) +#define CET_ENDBR_EN BIT_ULL(2) +#define CET_LEG_IW_EN BIT_ULL(3) +#define CET_NO_TRACK_EN BIT_ULL(4) +#define CET_SUPPRESS_DISABLE BIT_ULL(5) +#define CET_RESERVED (BIT_ULL(6) | BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9)) +#define CET_SUPPRESS BIT_ULL(10) +#define CET_WAIT_ENDBR BIT_ULL(11) + +#define MSR_IA32_PL0_SSP 0x000006a4 /* kernel shadow stack pointer */ +#define MSR_IA32_PL1_SSP 0x000006a5 /* ring-1 shadow stack pointer */ +#define MSR_IA32_PL2_SSP 0x000006a6 /* ring-2 shadow stack pointer */ +#define MSR_IA32_PL3_SSP 0x000006a7 /* user shadow stack pointer */ +#define MSR_IA32_INT_SSP_TAB 0x000006a8 /* exception shadow stack table */ + #endif /* _ASM_X86_MSR_INDEX_H */ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 683749b80ae2..64477b527019 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -38,6 +38,8 @@ static const char *xfeature_names[] = "Processor Trace (unused)" , "Protection Keys User registers", "PASID state", + "Control-flow User registers" , + "Control-flow Kernel registers" , "unknown xstate feature" , }; @@ -53,6 +55,8 @@ static short xsave_cpuid_features[] __initdata = { X86_FEATURE_INTEL_PT, X86_FEATURE_PKU, X86_FEATURE_ENQCMD, + X86_FEATURE_SHSTK, /* XFEATURE_CET_USER */ + X86_FEATURE_SHSTK, /* XFEATURE_CET_KERNEL */ }; /* @@ -321,6 +325,8 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); print_xstate_feature(XFEATURE_MASK_PKRU); print_xstate_feature(XFEATURE_MASK_PASID); + print_xstate_feature(XFEATURE_MASK_CET_USER); + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); } /* @@ -596,6 +602,8 @@ static void check_xstate_against_struct(int nr) XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state); XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_USER, struct cet_user_state); + XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct cet_kernel_state); /* * Make *SURE* to add any feature numbers in below if @@ -605,7 +613,7 @@ static void check_xstate_against_struct(int nr) if ((nr < XFEATURE_YMM) || (nr >= XFEATURE_MAX) || (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) || - ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_LBR))) { + ((nr >= XFEATURE_RSRVD_COMP_13) && (nr <= XFEATURE_LBR))) { WARN_ONCE(1, "no structure for xstate: %d\n", nr); XSTATE_WARN_ON(1); } From patchwork Tue Apr 27 20:42:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC33EC43461 for ; Tue, 27 Apr 2021 20:44:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9D1E6613B0 for ; Tue, 27 Apr 2021 20:44:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D1E6613B0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id EB1C66B0078; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DC1CB6B007B; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 951686B0078; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0192.hostedemail.com [216.40.44.192]) by kanga.kvack.org (Postfix) with ESMTP id 3BCE06B0073 for ; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: from smtpin40.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 02AF64417 for ; Tue, 27 Apr 2021 20:44:12 +0000 (UTC) X-FDA: 78079324386.40.C8A1328 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf11.hostedemail.com (Postfix) with ESMTP id 4EF5F200025C for ; Tue, 27 Apr 2021 20:43:57 +0000 (UTC) IronPort-SDR: WcrR3bGLUmLJLL5Lt115oSt9DM7Rr7sYwCKR9eOvTw5+oKg+dicTwanI6a8o2H83SaIf1mYA1u JnuIiezJZ8IA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473649" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473649" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:10 -0700 IronPort-SDR: j6jj2Bo1P7nFi54Wldg40UuFycj2vNuL3SpeAQ3d/iWRP+SPOkzJM+4DqaKOt1iCfbEIf3P9Fe xdnaBuY7+rrg== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623446" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:09 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , Michael Kerrisk Subject: [PATCH v26 06/30] x86/cet: Add control-protection fault handler Date: Tue, 27 Apr 2021 13:42:51 -0700 Message-Id: <20210427204315.24153-7-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 4EF5F200025C X-Stat-Signature: i6deg5iumrgnr1ng6uxzdz1sbedu6d75 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf11; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556237-802741 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A control-protection fault is triggered when a control-flow transfer attempt violates Shadow Stack or Indirect Branch Tracking constraints. For example, the return address for a RET instruction differs from the copy on the shadow stack; or an indirect JMP instruction, without the NOTRACK prefix, arrives at a non-ENDBR opcode. The control-protection fault handler works in a similar way as the general protection fault handler. It provides the si_code SEGV_CPERR to the signal handler. Signed-off-by: Yu-cheng Yu Cc: Kees Cook Cc: Michael Kerrisk --- v25: - Change CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK. - Change X86_FEATURE_CET to X86_FEATURE_SHSTK. arch/x86/include/asm/idtentry.h | 4 ++ arch/x86/kernel/idt.c | 4 ++ arch/x86/kernel/signal_compat.c | 2 +- arch/x86/kernel/traps.c | 63 ++++++++++++++++++++++++++++++ include/uapi/asm-generic/siginfo.h | 3 +- 5 files changed, 74 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 5eb3bdf36a41..5791c02864ec 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -571,6 +571,10 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_SS, exc_stack_segment); DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_GP, exc_general_protection); DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC, exc_alignment_check); +#ifdef CONFIG_X86_SHADOW_STACK +DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP, exc_control_protection); +#endif + /* Raw exception entries which need extra work */ DECLARE_IDTENTRY_RAW(X86_TRAP_UD, exc_invalid_op); DECLARE_IDTENTRY_RAW(X86_TRAP_BP, exc_int3); diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index ee1a283f8e96..0315fb297dd3 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -105,6 +105,10 @@ static const __initconst struct idt_data def_idts[] = { #elif defined(CONFIG_X86_32) SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32), #endif + +#ifdef CONFIG_X86_SHADOW_STACK + INTG(X86_TRAP_CP, asm_exc_control_protection), +#endif }; /* diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c index a5330ff498f0..dd92490b1e7f 100644 --- a/arch/x86/kernel/signal_compat.c +++ b/arch/x86/kernel/signal_compat.c @@ -27,7 +27,7 @@ static inline void signal_compat_build_tests(void) */ BUILD_BUG_ON(NSIGILL != 11); BUILD_BUG_ON(NSIGFPE != 15); - BUILD_BUG_ON(NSIGSEGV != 9); + BUILD_BUG_ON(NSIGSEGV != 10); BUILD_BUG_ON(NSIGBUS != 5); BUILD_BUG_ON(NSIGTRAP != 5); BUILD_BUG_ON(NSIGCHLD != 6); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 651e3e508959..a40b34b09400 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include @@ -606,6 +607,68 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) cond_local_irq_disable(regs); } +#ifdef CONFIG_X86_SHADOW_STACK +static const char * const control_protection_err[] = { + "unknown", + "near-ret", + "far-ret/iret", + "endbranch", + "rstorssp", + "setssbsy", + "unknown", +}; + +static DEFINE_RATELIMIT_STATE(cpf_rate, DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); + +/* + * When a control protection exception occurs, send a signal to the responsible + * application. Currently, control protection is only enabled for user mode. + * This exception should not come from kernel mode. + */ +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) +{ + struct task_struct *tsk; + + if (!user_mode(regs)) { + pr_emerg("PANIC: unexpected kernel control protection fault\n"); + die("kernel control protection fault", regs, error_code); + panic("Machine halted."); + } + + cond_local_irq_enable(regs); + + if (!boot_cpu_has(X86_FEATURE_SHSTK)) + WARN_ONCE(1, "Control protection fault with CET support disabled\n"); + + tsk = current; + tsk->thread.error_code = error_code; + tsk->thread.trap_nr = X86_TRAP_CP; + + /* + * Ratelimit to prevent log spamming. + */ + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && + __ratelimit(&cpf_rate)) { + unsigned long ssp; + int cpf_type; + + cpf_type = array_index_nospec(error_code, ARRAY_SIZE(control_protection_err)); + + rdmsrl(MSR_IA32_PL3_SSP, ssp); + pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)", + tsk->comm, task_pid_nr(tsk), + regs->ip, regs->sp, ssp, error_code, + control_protection_err[cpf_type]); + print_vma_addr(KERN_CONT " in ", regs->ip); + pr_cont("\n"); + } + + force_sig_fault(SIGSEGV, SEGV_CPERR, (void __user *)0); + cond_local_irq_disable(regs); +} +#endif + static bool do_int3(struct pt_regs *regs) { int res; diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index d2597000407a..1c2ea91284a0 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -231,7 +231,8 @@ typedef struct siginfo { #define SEGV_ADIPERR 7 /* Precise MCD exception */ #define SEGV_MTEAERR 8 /* Asynchronous ARM MTE error */ #define SEGV_MTESERR 9 /* Synchronous ARM MTE exception */ -#define NSIGSEGV 9 +#define SEGV_CPERR 10 /* Control protection fault */ +#define NSIGSEGV 10 /* * SIGBUS si_codes From patchwork Tue Apr 27 20:42:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227237 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BE0BC43618 for ; Tue, 27 Apr 2021 20:44:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C29F860720 for ; Tue, 27 Apr 2021 20:44:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C29F860720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B2C7B6B0074; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A65AB6B0075; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 759DF6B007B; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0099.hostedemail.com [216.40.44.99]) by kanga.kvack.org (Postfix) with ESMTP id 3DC096B0074 for ; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id F3DB1246E for ; Tue, 27 Apr 2021 20:44:12 +0000 (UTC) X-FDA: 78079324386.11.8FB34F5 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf10.hostedemail.com (Postfix) with ESMTP id 2B26940002F7 for ; Tue, 27 Apr 2021 20:44:02 +0000 (UTC) IronPort-SDR: TLrllzjFJ9EfJpBg6ebKcC44v5U8qxrFD2kNbPU0tN76jNf65ZNtc+0HerzjLFhlH7GLndnRIi yMHAkVA7aNHg== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473652" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473652" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:10 -0700 IronPort-SDR: xGVQoIBCa5wVQaRj17c3pGosd4YAlB9P5BSnsIvO51vRFOH3QlAffJhWv6WnbCmr426HDlEOCX bKzQW3tyUGSw== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623455" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:10 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" , Christoph Hellwig Subject: [PATCH v26 07/30] x86/mm: Remove _PAGE_DIRTY from kernel RO pages Date: Tue, 27 Apr 2021 13:42:52 -0700 Message-Id: <20210427204315.24153-8-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 2B26940002F7 X-Stat-Signature: o7makqby99d755ygz89fk9z71s9ukpeo Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556242-40490 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The x86 family of processors do not directly create read-only and Dirty PTEs. These PTEs are created by software. One such case is that kernel read-only pages are historically setup as Dirty. New processors that support Shadow Stack regard read-only and Dirty PTEs as shadow stack pages. This results in ambiguity between shadow stack and kernel read-only pages. To resolve this, removed Dirty from kernel read- only pages. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: "H. Peter Anvin" Cc: Kees Cook Cc: Thomas Gleixner Cc: Dave Hansen Cc: Christoph Hellwig Cc: Andy Lutomirski Cc: Ingo Molnar Cc: Borislav Petkov Cc: Peter Zijlstra --- arch/x86/include/asm/pgtable_types.h | 6 +++--- arch/x86/mm/pat/set_memory.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index f24d7ef8fffa..9db61817dfff 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -192,10 +192,10 @@ enum page_cache_mode { #define _KERNPG_TABLE (__PP|__RW| 0|___A| 0|___D| 0| 0| _ENC) #define _PAGE_TABLE_NOENC (__PP|__RW|_USR|___A| 0|___D| 0| 0) #define _PAGE_TABLE (__PP|__RW|_USR|___A| 0|___D| 0| 0| _ENC) -#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX|___D| 0|___G) -#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0|___D| 0|___G) +#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX| 0| 0|___G) +#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0| 0| 0|___G) #define __PAGE_KERNEL_NOCACHE (__PP|__RW| 0|___A|__NX|___D| 0|___G| __NC) -#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX|___D| 0|___G) +#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX| 0| 0|___G) #define __PAGE_KERNEL_LARGE (__PP|__RW| 0|___A|__NX|___D|_PSE|___G) #define __PAGE_KERNEL_LARGE_EXEC (__PP|__RW| 0|___A| 0|___D|_PSE|___G) #define __PAGE_KERNEL_WP (__PP|__RW| 0|___A|__NX|___D| 0|___G| __WP) diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 16f878c26667..6bebb95a6988 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -1932,7 +1932,7 @@ int set_memory_nx(unsigned long addr, int numpages) int set_memory_ro(unsigned long addr, int numpages) { - return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW), 0); + return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW | _PAGE_DIRTY), 0); } int set_memory_rw(unsigned long addr, int numpages) From patchwork Tue Apr 27 20:42:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A638C4363F for ; Tue, 27 Apr 2021 20:44:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C1DFE61410 for ; Tue, 27 Apr 2021 20:44:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C1DFE61410 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1CB8C6B0073; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 13E786B0080; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B27BD6B0073; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id 8B9E36B0074 for ; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: from smtpin34.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 379B04DDF for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) X-FDA: 78079324386.34.9669D78 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 1F9C22000250 for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) IronPort-SDR: TaXgQRAzLnrr3UHbuCGbNNmp1MAtprnb1DyR7SGfUUSp5YA2kwHdTpLqsbDTaXYCcXc3mY71uW ifgH7N0M9yfQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473655" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473655" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:11 -0700 IronPort-SDR: aUWQgeFi1WwBEOmzfnlYum8c6FP6jdDZZT8yZvx27Erb1lYmKik6XfXYOEqGM4ZSX5Cr8vjhMm /VzHF4qwLtgQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623460" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:10 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 08/30] x86/mm: Move pmd_write(), pud_write() up in the file Date: Tue, 27 Apr 2021 13:42:53 -0700 Message-Id: <20210427204315.24153-9-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 1F9C22000250 X-Stat-Signature: hidxnsd4bmrmwg5u91fa99omeyz5drwh X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556253-425803 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: To prepare the introduction of _PAGE_COW, move pmd_write() and pud_write() up in the file, so that they can be used by other helpers below. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov --- arch/x86/include/asm/pgtable.h | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a02c67291cfc..c1650d0af1b5 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -185,6 +185,18 @@ static inline int pte_write(pte_t pte) return pte_flags(pte) & _PAGE_RW; } +#define pmd_write pmd_write +static inline int pmd_write(pmd_t pmd) +{ + return pmd_flags(pmd) & _PAGE_RW; +} + +#define pud_write pud_write +static inline int pud_write(pud_t pud) +{ + return pud_flags(pud) & _PAGE_RW; +} + static inline int pte_huge(pte_t pte) { return pte_flags(pte) & _PAGE_PSE; @@ -1128,12 +1140,6 @@ extern int pmdp_clear_flush_young(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); -#define pmd_write pmd_write -static inline int pmd_write(pmd_t pmd) -{ - return pmd_flags(pmd) & _PAGE_RW; -} - #define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) @@ -1155,12 +1161,6 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp); } -#define pud_write pud_write -static inline int pud_write(pud_t pud) -{ - return pud_flags(pud) & _PAGE_RW; -} - #ifndef pmdp_establish #define pmdp_establish pmdp_establish static inline pmd_t pmdp_establish(struct vm_area_struct *vma, From patchwork Tue Apr 27 20:42:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227243 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4041DC43470 for ; Tue, 27 Apr 2021 20:44:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C1F066140E for ; Tue, 27 Apr 2021 20:44:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C1F066140E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 759E46B007D; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 61D7F6B0080; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3AE256B007D; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0222.hostedemail.com [216.40.44.222]) by kanga.kvack.org (Postfix) with ESMTP id E6BE16B0075 for ; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id AB788824999B for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) X-FDA: 78079324386.22.F2539C8 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf11.hostedemail.com (Postfix) with ESMTP id 17F7F2000247 for ; Tue, 27 Apr 2021 20:43:57 +0000 (UTC) IronPort-SDR: JZUjVUSbtnRiWFiZ+jxdmDCdjd50OS3gIvd6VptgyzkDKH90CV2dUTRL4RFuv1l76QxrBc396Y ii96bvrwNM3w== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473658" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473658" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:12 -0700 IronPort-SDR: YEqKw5Clm46VkyL7SMI1DkZfYOGTKDUy8ls0zTVWqGqnGjcJ6L7LStffc1UqIvFtwlFZCfNm1p Yrj/IZidempA== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623466" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:11 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 09/30] x86/mm: Introduce _PAGE_COW Date: Tue, 27 Apr 2021 13:42:54 -0700 Message-Id: <20210427204315.24153-10-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 17F7F2000247 X-Stat-Signature: cps9z1znh3gcq8xu83y6whmy81xtptha Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf11; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556237-492784 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There is essentially no room left in the x86 hardware PTEs on some OSes (not Linux). That left the hardware architects looking for a way to represent a new memory type (shadow stack) within the existing bits. They chose to repurpose a lightly-used state: Write=0, Dirty=1. The reason it's lightly used is that Dirty=1 is normally set by hardware and cannot normally be set by hardware on a Write=0 PTE. Software must normally be involved to create one of these PTEs, so software can simply opt to not create them. In places where Linux normally creates Write=0, Dirty=1, it can use the software-defined _PAGE_COW in place of the hardware _PAGE_DIRTY. In other words, whenever Linux needs to create Write=0, Dirty=1, it instead creates Write=0, Cow=1, except for shadow stack, which is Write=0, Dirty=1. This clearly separates shadow stack from other data, and results in the following: (a) A modified, copy-on-write (COW) page: (Write=0, Cow=1) (b) A R/O page that has been COW'ed: (Write=0, Cow=1) The user page is in a R/O VMA, and get_user_pages() needs a writable copy. The page fault handler creates a copy of the page and sets the new copy's PTE as Write=0 and Cow=1. (c) A shadow stack PTE: (Write=0, Dirty=1) (d) A shared shadow stack PTE: (Write=0, Cow=1) When a shadow stack page is being shared among processes (this happens at fork()), its PTE is made Dirty=0, so the next shadow stack access causes a fault, and the page is duplicated and Dirty=1 is set again. This is the COW equivalent for shadow stack pages, even though it's copy-on-access rather than copy-on-write. (e) A page where the processor observed a Write=1 PTE, started a write, set Dirty=1, but then observed a Write=0 PTE. That's possible today, but will not happen on processors that support shadow stack. Define _PAGE_COW and update pte_*() helpers and apply the same changes to pmd and pud. After this, there are six free bits left in the 64-bit PTE, and no more free bits in the 32-bit PTE (except for PAE) and Shadow Stack is not implemented for the 32-bit kernel. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov --- v24: - Replace CONFIG_X86_CET with CONFIG_X86_SHADOW_STACK, to reflect the Kconfig changes. arch/x86/include/asm/pgtable.h | 195 ++++++++++++++++++++++++--- arch/x86/include/asm/pgtable_types.h | 42 +++++- 2 files changed, 216 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index c1650d0af1b5..9c056d5815de 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -121,11 +121,21 @@ extern pmdval_t early_pmd_flags; * The following only work if pte_present() is true. * Undefined behaviour if not.. */ -static inline int pte_dirty(pte_t pte) +static inline bool pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY; + /* + * A dirty PTE has Dirty=1 or Cow=1. + */ + return pte_flags(pte) & _PAGE_DIRTY_BITS; } +static inline bool pte_shstk(pte_t pte) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return false; + + return (pte_flags(pte) & (_PAGE_RW | _PAGE_DIRTY)) == _PAGE_DIRTY; +} static inline u32 read_pkru(void) { @@ -160,9 +170,20 @@ static inline int pte_young(pte_t pte) return pte_flags(pte) & _PAGE_ACCESSED; } -static inline int pmd_dirty(pmd_t pmd) +static inline bool pmd_dirty(pmd_t pmd) +{ + /* + * A dirty PMD has Dirty=1 or Cow=1. + */ + return pmd_flags(pmd) & _PAGE_DIRTY_BITS; +} + +static inline bool pmd_shstk(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY; + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return false; + + return (pmd_flags(pmd) & (_PAGE_RW | _PAGE_DIRTY)) == _PAGE_DIRTY; } static inline int pmd_young(pmd_t pmd) @@ -170,9 +191,12 @@ static inline int pmd_young(pmd_t pmd) return pmd_flags(pmd) & _PAGE_ACCESSED; } -static inline int pud_dirty(pud_t pud) +static inline bool pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY; + /* + * A dirty PUD has Dirty=1 or Cow=1. + */ + return pud_flags(pud) & _PAGE_DIRTY_BITS; } static inline int pud_young(pud_t pud) @@ -182,13 +206,23 @@ static inline int pud_young(pud_t pud) static inline int pte_write(pte_t pte) { - return pte_flags(pte) & _PAGE_RW; + /* + * Shadow stack pages are always writable - but not by normal + * instructions, and only by shadow stack operations. Therefore, + * the W=0,D=1 test with pte_shstk(). + */ + return (pte_flags(pte) & _PAGE_RW) || pte_shstk(pte); } #define pmd_write pmd_write static inline int pmd_write(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_RW; + /* + * Shadow stack pages are always writable - but not by normal + * instructions, and only by shadow stack operations. Therefore, + * the W=0,D=1 test with pmd_shstk(). + */ + return (pmd_flags(pmd) & _PAGE_RW) || pmd_shstk(pmd); } #define pud_write pud_write @@ -326,6 +360,24 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear) return native_make_pte(v & ~clear); } +static inline pte_t pte_mkcow(pte_t pte) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pte; + + pte = pte_clear_flags(pte, _PAGE_DIRTY); + return pte_set_flags(pte, _PAGE_COW); +} + +static inline pte_t pte_clear_cow(pte_t pte) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pte; + + pte = pte_set_flags(pte, _PAGE_DIRTY); + return pte_clear_flags(pte, _PAGE_COW); +} + #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP static inline int pte_uffd_wp(pte_t pte) { @@ -345,7 +397,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY); + return pte_clear_flags(pte, _PAGE_DIRTY_BITS); } static inline pte_t pte_mkold(pte_t pte) @@ -355,7 +407,16 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_wrprotect(pte_t pte) { - return pte_clear_flags(pte, _PAGE_RW); + pte = pte_clear_flags(pte, _PAGE_RW); + + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PTE (RW=0, Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pte_dirty(pte)) + pte = pte_mkcow(pte); + return pte; } static inline pte_t pte_mkexec(pte_t pte) @@ -365,7 +426,18 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return pte_set_flags(pte, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pteval_t dirty = _PAGE_DIRTY; + + /* Avoid creating (HW)Dirty=1, Write=0 PTEs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !pte_write(pte)) + dirty = _PAGE_COW; + + return pte_set_flags(pte, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mkwrite_shstk(pte_t pte) +{ + return pte_clear_cow(pte); } static inline pte_t pte_mkyoung(pte_t pte) @@ -375,7 +447,12 @@ static inline pte_t pte_mkyoung(pte_t pte) static inline pte_t pte_mkwrite(pte_t pte) { - return pte_set_flags(pte, _PAGE_RW); + pte = pte_set_flags(pte, _PAGE_RW); + + if (pte_dirty(pte)) + pte = pte_clear_cow(pte); + + return pte; } static inline pte_t pte_mkhuge(pte_t pte) @@ -422,6 +499,24 @@ static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear) return native_make_pmd(v & ~clear); } +static inline pmd_t pmd_mkcow(pmd_t pmd) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pmd; + + pmd = pmd_clear_flags(pmd, _PAGE_DIRTY); + return pmd_set_flags(pmd, _PAGE_COW); +} + +static inline pmd_t pmd_clear_cow(pmd_t pmd) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pmd; + + pmd = pmd_set_flags(pmd, _PAGE_DIRTY); + return pmd_clear_flags(pmd, _PAGE_COW); +} + #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP static inline int pmd_uffd_wp(pmd_t pmd) { @@ -446,17 +541,36 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY); + return pmd_clear_flags(pmd, _PAGE_DIRTY_BITS); } static inline pmd_t pmd_wrprotect(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_RW); + pmd = pmd_clear_flags(pmd, _PAGE_RW); + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PMD (RW=0, Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pmd_dirty(pmd)) + pmd = pmd_mkcow(pmd); + return pmd; } static inline pmd_t pmd_mkdirty(pmd_t pmd) { - return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pmdval_t dirty = _PAGE_DIRTY; + + /* Avoid creating (HW)Dirty=1, Write=0 PMDs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !pmd_write(pmd)) + dirty = _PAGE_COW; + + return pmd_set_flags(pmd, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd) +{ + return pmd_clear_cow(pmd); } static inline pmd_t pmd_mkdevmap(pmd_t pmd) @@ -476,7 +590,11 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) static inline pmd_t pmd_mkwrite(pmd_t pmd) { - return pmd_set_flags(pmd, _PAGE_RW); + pmd = pmd_set_flags(pmd, _PAGE_RW); + + if (pmd_dirty(pmd)) + pmd = pmd_clear_cow(pmd); + return pmd; } static inline pud_t pud_set_flags(pud_t pud, pudval_t set) @@ -493,6 +611,24 @@ static inline pud_t pud_clear_flags(pud_t pud, pudval_t clear) return native_make_pud(v & ~clear); } +static inline pud_t pud_mkcow(pud_t pud) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pud; + + pud = pud_clear_flags(pud, _PAGE_DIRTY); + return pud_set_flags(pud, _PAGE_COW); +} + +static inline pud_t pud_clear_cow(pud_t pud) +{ + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return pud; + + pud = pud_set_flags(pud, _PAGE_DIRTY); + return pud_clear_flags(pud, _PAGE_COW); +} + static inline pud_t pud_mkold(pud_t pud) { return pud_clear_flags(pud, _PAGE_ACCESSED); @@ -500,17 +636,32 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY); + return pud_clear_flags(pud, _PAGE_DIRTY_BITS); } static inline pud_t pud_wrprotect(pud_t pud) { - return pud_clear_flags(pud, _PAGE_RW); + pud = pud_clear_flags(pud, _PAGE_RW); + + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PUD (RW=0, Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pud_dirty(pud)) + pud = pud_mkcow(pud); + return pud; } static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pudval_t dirty = _PAGE_DIRTY; + + /* Avoid creating (HW)Dirty=1, Write=0 PUDs */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !pud_write(pud)) + dirty = _PAGE_COW; + + return pud_set_flags(pud, dirty | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) @@ -530,7 +681,11 @@ static inline pud_t pud_mkyoung(pud_t pud) static inline pud_t pud_mkwrite(pud_t pud) { - return pud_set_flags(pud, _PAGE_RW); + pud = pud_set_flags(pud, _PAGE_RW); + + if (pud_dirty(pud)) + pud = pud_clear_cow(pud); + return pud; } #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 9db61817dfff..ce853c28c253 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -23,7 +23,8 @@ #define _PAGE_BIT_SOFTW2 10 /* " */ #define _PAGE_BIT_SOFTW3 11 /* " */ #define _PAGE_BIT_PAT_LARGE 12 /* On 2MB or 1GB pages */ -#define _PAGE_BIT_SOFTW4 58 /* available for programmer */ +#define _PAGE_BIT_SOFTW4 57 /* available for programmer */ +#define _PAGE_BIT_SOFTW5 58 /* available for programmer */ #define _PAGE_BIT_PKEY_BIT0 59 /* Protection Keys, bit 1/4 */ #define _PAGE_BIT_PKEY_BIT1 60 /* Protection Keys, bit 2/4 */ #define _PAGE_BIT_PKEY_BIT2 61 /* Protection Keys, bit 3/4 */ @@ -36,6 +37,15 @@ #define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */ #define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4 +/* + * Indicates a copy-on-write page. + */ +#ifdef CONFIG_X86_SHADOW_STACK +#define _PAGE_BIT_COW _PAGE_BIT_SOFTW5 /* copy-on-write */ +#else +#define _PAGE_BIT_COW 0 +#endif + /* If _PAGE_BIT_PRESENT is clear, we use these: */ /* - if the user mapped it with PROT_NONE; pte_present gives true */ #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL @@ -117,6 +127,36 @@ #define _PAGE_DEVMAP (_AT(pteval_t, 0)) #endif +/* + * The hardware requires shadow stack to be read-only and Dirty. + * _PAGE_COW is a software-only bit used to separate copy-on-write PTEs + * from shadow stack PTEs: + * (a) A modified, copy-on-write (COW) page: (Write=0, Cow=1) + * (b) A R/O page that has been COW'ed: (Write=0, Cow=1) + * The user page is in a R/O VMA, and get_user_pages() needs a + * writable copy. The page fault handler creates a copy of the page + * and sets the new copy's PTE as Write=0, Cow=1. + * (c) A shadow stack PTE: (Write=0, Dirty=1) + * (d) A shared (copy-on-access) shadow stack PTE: (Write=0, Cow=1) + * When a shadow stack page is being shared among processes (this + * happens at fork()), its PTE is cleared of _PAGE_DIRTY, so the next + * shadow stack access causes a fault, and the page is duplicated and + * _PAGE_DIRTY is set again. This is the COW equivalent for shadow + * stack pages, even though it's copy-on-access rather than + * copy-on-write. + * (e) A page where the processor observed a Write=1 PTE, started a write, + * set Dirty=1, but then observed a Write=0 PTE (changed by another + * thread). That's possible today, but will not happen on processors + * that support shadow stack. + */ +#ifdef CONFIG_X86_SHADOW_STACK +#define _PAGE_COW (_AT(pteval_t, 1) << _PAGE_BIT_COW) +#else +#define _PAGE_COW (_AT(pteval_t, 0)) +#endif + +#define _PAGE_DIRTY_BITS (_PAGE_DIRTY | _PAGE_COW) + #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) /* From patchwork Tue Apr 27 20:42:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 558D8C433B4 for ; Tue, 27 Apr 2021 20:44:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DF43C61410 for ; Tue, 27 Apr 2021 20:44:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DF43C61410 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A50B96B0075; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9C5756B007E; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 756726B0075; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0132.hostedemail.com [216.40.44.132]) by kanga.kvack.org (Postfix) with ESMTP id E285B6B007E for ; Tue, 27 Apr 2021 16:44:13 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id A7C958249980 for ; Tue, 27 Apr 2021 20:44:13 +0000 (UTC) X-FDA: 78079324386.11.63E78E7 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf10.hostedemail.com (Postfix) with ESMTP id DB7CA40002F5 for ; Tue, 27 Apr 2021 20:44:02 +0000 (UTC) IronPort-SDR: pHY3/nPSL56lpkRbp2nqcWIxyGIg2Sw3NwhZ0OLm/MOnhTp37S/8YBSlKG0QGB8cv1NCS7APPw MPupxlnEts1g== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473663" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473663" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:12 -0700 IronPort-SDR: fhIQad1STiOfMO1Xl7m7hbTbq5zRyOnKXEPM5z/eyo+DYYh0yiHi5PmW/8xS+Hul1JCotpf8Kf rUsRrFJTFOzA== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623473" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:11 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" , David Airlie , Joonas Lahtinen , Jani Nikula , Daniel Vetter , Rodrigo Vivi , Zhenyu Wang , Zhi Wang Subject: [PATCH v26 10/30] drm/i915/gvt: Change _PAGE_DIRTY to _PAGE_DIRTY_BITS Date: Tue, 27 Apr 2021 13:42:55 -0700 Message-Id: <20210427204315.24153-11-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: DB7CA40002F5 X-Stat-Signature: 186j9j53ro374axu3jsa9rgypcgbisx4 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556242-1955 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After the introduction of _PAGE_COW, a modified page's PTE can have either _PAGE_DIRTY or _PAGE_COW. Change _PAGE_DIRTY to _PAGE_DIRTY_BITS. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Kirill A. Shutemov Cc: David Airlie Cc: Joonas Lahtinen Cc: Jani Nikula Cc: Daniel Vetter Cc: Rodrigo Vivi Cc: Zhenyu Wang Cc: Zhi Wang --- drivers/gpu/drm/i915/gvt/gtt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c index 897c007ea96a..937b6083b2dc 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.c +++ b/drivers/gpu/drm/i915/gvt/gtt.c @@ -1216,7 +1216,7 @@ static int split_2MB_gtt_entry(struct intel_vgpu *vgpu, } /* Clear dirty field. */ - se->val64 &= ~_PAGE_DIRTY; + se->val64 &= ~_PAGE_DIRTY_BITS; ops->clear_pse(se); ops->clear_ips(se); From patchwork Tue Apr 27 20:42:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227245 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AF38C43603 for ; Tue, 27 Apr 2021 20:44:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1BD1160720 for ; Tue, 27 Apr 2021 20:44:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1BD1160720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 140FD6B007B; Tue, 27 Apr 2021 16:44:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 077D36B007E; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE5BB6B0080; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A35C16B007B for ; Tue, 27 Apr 2021 16:44:14 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 630894DDF for ; Tue, 27 Apr 2021 20:44:14 +0000 (UTC) X-FDA: 78079324428.06.66A9BBC Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 47ED32000247 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) IronPort-SDR: iXDcHq6SC2rYd6WIIbdgETZQwaZJwCJ5HoDCC5nbYE1F0jwLksDSFXzVRLpdUDDuhHV2ug/f4/ P2rqzvUxevqQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473667" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473667" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:13 -0700 IronPort-SDR: GiOu6oGZC8O6G8a2/HSiJ96TGKrREMZ7zhgi84VlXsBeJAxvwBZEF7I6QU9NQblIGwA95wFEnf UFoV9HxkYXHw== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623479" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:12 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 11/30] x86/mm: Update pte_modify for _PAGE_COW Date: Tue, 27 Apr 2021 13:42:56 -0700 Message-Id: <20210427204315.24153-12-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 47ED32000247 X-Stat-Signature: 3zmwigq3pcon1zixo1r49aopmd1fj6t9 X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556255-874478 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The read-only and Dirty PTE has been used to indicate copy-on-write pages. However, newer x86 processors also regard a read-only and Dirty PTE as a shadow stack page. In order to separate the two, the software-defined _PAGE_COW is created to replace _PAGE_DIRTY for the copy-on-write case, and pte_*() are updated. Pte_modify() changes a PTE to 'newprot', but it doesn't use the pte_*(). Introduce fixup_dirty_pte(), which sets a dirty PTE, based on _PAGE_RW, to either _PAGE_DIRTY or _PAGE_COW. Apply the same changes to pmd_modify(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov --- arch/x86/include/asm/pgtable.h | 37 ++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 9c056d5815de..e1739f590ca6 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -799,6 +799,23 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd) static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask); +static inline pteval_t fixup_dirty_pte(pteval_t pteval) +{ + pte_t pte = __pte(pteval); + + /* + * Fix up potential shadow stack page flags because the RO, Dirty + * PTE is special. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + if (pte_dirty(pte)) { + pte = pte_mkclean(pte); + pte = pte_mkdirty(pte); + } + } + return pte_val(pte); +} + static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) { pteval_t val = pte_val(pte), oldval = val; @@ -809,16 +826,36 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) */ val &= _PAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK; + val = fixup_dirty_pte(val); val = flip_protnone_guard(oldval, val, PTE_PFN_MASK); return __pte(val); } +static inline int pmd_write(pmd_t pmd); +static inline pmdval_t fixup_dirty_pmd(pmdval_t pmdval) +{ + pmd_t pmd = __pmd(pmdval); + + /* + * Fix up potential shadow stack page flags because the RO, Dirty + * PMD is special. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + if (pmd_dirty(pmd)) { + pmd = pmd_mkclean(pmd); + pmd = pmd_mkdirty(pmd); + } + } + return pmd_val(pmd); +} + static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) { pmdval_t val = pmd_val(pmd), oldval = val; val &= _HPAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK; + val = fixup_dirty_pmd(val); val = flip_protnone_guard(oldval, val, PHYSICAL_PMD_PAGE_MASK); return __pmd(val); } From patchwork Tue Apr 27 20:42:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D12C0C433B4 for ; Tue, 27 Apr 2021 20:44:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7CB2660720 for ; Tue, 27 Apr 2021 20:44:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7CB2660720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BC5256B007E; Tue, 27 Apr 2021 16:44:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9A0F26B0080; Tue, 27 Apr 2021 16:44:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7F5326B0081; Tue, 27 Apr 2021 16:44:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0042.hostedemail.com [216.40.44.42]) by kanga.kvack.org (Postfix) with ESMTP id 5A73C6B007E for ; Tue, 27 Apr 2021 16:44:15 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1F745246E for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) X-FDA: 78079324470.29.AA1EF67 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 11CE02000271 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) IronPort-SDR: IemRhTGkPGaiGkCcZkDs2w3t5iw9KkbLd2LZufwHdgj2ptutBsobnhaPpG43CbhobMsqp0EKIK /z4ozIgHQ71g== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473669" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473669" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:14 -0700 IronPort-SDR: rSbJOHYWKU44KGErZE0rybPPRTg8QpnuigGhKK94jTQw0kWk+G1SMwD+BpPauzXKgkBonA1TEU 3QrunAR6FGIQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623486" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:13 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 12/30] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW Date: Tue, 27 Apr 2021 13:42:57 -0700 Message-Id: <20210427204315.24153-13-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 11CE02000271 X-Stat-Signature: h8x7zgq13r9e7mnbgukna1be9efmpdch X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556255-325010 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When Shadow Stack is introduced, [R/O + _PAGE_DIRTY] PTE is reserved for shadow stack. Copy-on-write PTEs have [R/O + _PAGE_COW]. When a PTE goes from [R/W + _PAGE_DIRTY] to [R/O + _PAGE_COW], it could become a transient shadow stack PTE in two cases: The first case is that some processors can start a write but end up seeing a read-only PTE by the time they get to the Dirty bit, creating a transient shadow stack PTE. However, this will not occur on processors supporting Shadow Stack, and a TLB flush is not necessary. The second case is that when _PAGE_DIRTY is replaced with _PAGE_COW non- atomically, a transient shadow stack PTE can be created as a result. Thus, prevent that with cmpxchg. Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many insights to the issue. Jann Horn provided the cmpxchg solution. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Kirill A. Shutemov --- arch/x86/include/asm/pgtable.h | 36 ++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index e1739f590ca6..46d9394b884f 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1306,6 +1306,24 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { + /* + * If Shadow Stack is enabled, pte_wrprotect() moves _PAGE_DIRTY + * to _PAGE_COW (see comments at pte_wrprotect()). + * When a thread reads a RW=1, Dirty=0 PTE and before changing it + * to RW=0, Dirty=0, another thread could have written to the page + * and the PTE is RW=1, Dirty=1 now. Use try_cmpxchg() to detect + * PTE changes and update old_pte, then try again. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pte_t old_pte, new_pte; + + old_pte = READ_ONCE(*ptep); + do { + new_pte = pte_wrprotect(old_pte); + } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte)); + + return; + } clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte); } @@ -1350,6 +1368,24 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { + /* + * If Shadow Stack is enabled, pmd_wrprotect() moves _PAGE_DIRTY + * to _PAGE_COW (see comments at pmd_wrprotect()). + * When a thread reads a RW=1, Dirty=0 PMD and before changing it + * to RW=0, Dirty=0, another thread could have written to the page + * and the PMD is RW=1, Dirty=1 now. Use try_cmpxchg() to detect + * PMD changes and update old_pmd, then try again. + */ + if (cpu_feature_enabled(X86_FEATURE_SHSTK)) { + pmd_t old_pmd, new_pmd; + + old_pmd = READ_ONCE(*pmdp); + do { + new_pmd = pmd_wrprotect(old_pmd); + } while (!try_cmpxchg((pmdval_t *)pmdp, (pmdval_t *)&old_pmd, pmd_val(new_pmd))); + + return; + } clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp); } From patchwork Tue Apr 27 20:42:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 213EAC43603 for ; Tue, 27 Apr 2021 20:44:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B924F613F8 for ; Tue, 27 Apr 2021 20:44:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B924F613F8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 867DC6B0080; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7F6936B0081; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5CFBD6B0082; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0154.hostedemail.com [216.40.44.154]) by kanga.kvack.org (Postfix) with ESMTP id 311EC6B0080 for ; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id D52FB8249980 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) X-FDA: 78079324470.10.F2E3C20 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf07.hostedemail.com (Postfix) with ESMTP id E2021A00018C for ; Tue, 27 Apr 2021 20:44:14 +0000 (UTC) IronPort-SDR: 9MFnJxjFwIoLO/j0dYJ0ZHnitRsk28V7qF/Vn0qze6P4/Q0VG0T/f3Fz3DNtRPy3tuQ8ciQp7A 6f0Nlumv4yiQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473672" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473672" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:14 -0700 IronPort-SDR: fPQDCgww/rvIQCz4XzmmPJ6YWN3uFmotycfnhwhLnt+5VNk5846wUueDBHbfJE6q5HYKmwekrj uerC1/LhuDFw== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623499" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:14 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 13/30] mm: Introduce VM_SHADOW_STACK for shadow stack memory Date: Tue, 27 Apr 2021 13:42:58 -0700 Message-Id: <20210427204315.24153-14-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E2021A00018C X-Stat-Signature: zjzfibn9q9m8py15kbnrkassfkefxpbs Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf07; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556254-230152 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A shadow stack PTE must be read-only and have _PAGE_DIRTY set. However, read-only and Dirty PTEs also exist for copy-on-write (COW) pages. These two cases are handled differently for page faults. Introduce VM_SHADOW_STACK to track shadow stack VMAs. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v24: - Change VM_SHSTK to VM_SHADOW_STACK. - Change CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK to reflect Kconfig changes. Documentation/filesystems/proc.rst | 1 + arch/x86/mm/mmap.c | 2 ++ fs/proc/task_mmu.c | 3 +++ include/linux/mm.h | 8 ++++++++ 4 files changed, 14 insertions(+) diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 48fbfc336ebf..5d8a2d75c799 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -549,6 +549,7 @@ encoded manner. The codes are the following: mg mergable advise flag bt arm64 BTI guarded page mt arm64 MTE allocation tags are enabled + ss shadow stack page == ======================================= Note that there is no guarantee that every flag and associated mnemonic will diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index c90c20904a60..f3f52c5e2fd6 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -165,6 +165,8 @@ unsigned long get_mmap_base(int is_legacy) const char *arch_vma_name(struct vm_area_struct *vma) { + if (vma->vm_flags & VM_SHADOW_STACK) + return "[shadow stack]"; return NULL; } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e862cab69583..0aa57de9dfab 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -661,6 +661,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_PKEY_BIT4)] = "", #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ +#ifdef CONFIG_ARCH_HAS_SHADOW_STACK + [ilog2(VM_SHADOW_STACK)]= "ss", +#endif }; size_t i; diff --git a/include/linux/mm.h b/include/linux/mm.h index 8ba434287387..08282eb2f195 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -312,11 +312,13 @@ extern unsigned int kobjsize(const void *objp); #define VM_HIGH_ARCH_BIT_2 34 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_3 35 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_BIT_4 36 /* bit only usable on 64-bit architectures */ +#define VM_HIGH_ARCH_BIT_5 37 /* bit only usable on 64-bit architectures */ #define VM_HIGH_ARCH_0 BIT(VM_HIGH_ARCH_BIT_0) #define VM_HIGH_ARCH_1 BIT(VM_HIGH_ARCH_BIT_1) #define VM_HIGH_ARCH_2 BIT(VM_HIGH_ARCH_BIT_2) #define VM_HIGH_ARCH_3 BIT(VM_HIGH_ARCH_BIT_3) #define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4) +#define VM_HIGH_ARCH_5 BIT(VM_HIGH_ARCH_BIT_5) #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */ #ifdef CONFIG_ARCH_HAS_PKEYS @@ -332,6 +334,12 @@ extern unsigned int kobjsize(const void *objp); #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ +#ifdef CONFIG_X86_SHADOW_STACK +# define VM_SHADOW_STACK VM_HIGH_ARCH_5 +#else +# define VM_SHADOW_STACK VM_NONE +#endif + #if defined(CONFIG_X86) # define VM_PAT VM_ARCH_1 /* PAT reserves whole VMA at once (x86) */ #elif defined(CONFIG_PPC) From patchwork Tue Apr 27 20:42:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58553C4361A for ; Tue, 27 Apr 2021 20:44:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E4C0F60720 for ; Tue, 27 Apr 2021 20:44:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E4C0F60720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 2E9CE6B0081; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 277466B0082; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0C6FC6B0083; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0142.hostedemail.com [216.40.44.142]) by kanga.kvack.org (Postfix) with ESMTP id CA0626B0081 for ; Tue, 27 Apr 2021 16:44:16 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 9459B4417 for ; Tue, 27 Apr 2021 20:44:16 +0000 (UTC) X-FDA: 78079324512.09.31BAB3F Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 6CEED2000261 for ; Tue, 27 Apr 2021 20:44:17 +0000 (UTC) IronPort-SDR: RxNb0DkcY4elo+30XqcK0sw82nDyAmo1AC2FaT2pHNPlJ9Bq21qY6PNsD2GBJiqJdaofBi8bZo 00PNsxfbQ0+w== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473676" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473676" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:15 -0700 IronPort-SDR: JTsITtzNjQfEiRw5TlsGstnNUL1WZhpobfNtlVwhnCZSpOy29dMh8GWRFxMseh4aRrZg8ajO2r 8DDrnmvmpNZA== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623511" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:14 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 14/30] x86/mm: Shadow Stack page fault error checking Date: Tue, 27 Apr 2021 13:42:59 -0700 Message-Id: <20210427204315.24153-15-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 6CEED2000261 X-Stat-Signature: 98849zegieb63e1ffc7yux7rngg768r7 X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556257-459528 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Shadow stack accesses are those that are performed by the CPU where it expects to encounter a shadow stack mapping. These accesses are performed implicitly by CALL/RET at the site of the shadow stack pointer. These accesses are made explicitly by shadow stack management instructions like WRUSSQ. Shadow stacks accesses to shadow-stack mapping can see faults in normal, valid operation just like regular accesses to regular mappings. Shadow stacks need some of the same features like delayed allocation, swap and copy-on-write. Shadow stack accesses can also result in errors, such as when a shadow stack overflows, or if a shadow stack access occurs to a non-shadow-stack mapping. In handling a shadow stack page fault, verify it occurs within a shadow stack mapping. It is always an error otherwise. For valid shadow stack accesses, set FAULT_FLAG_WRITE to effect copy-on-write. Because clearing _PAGE_DIRTY (vs. _PAGE_RW) is used to trigger the fault, shadow stack read fault and shadow stack write fault are not differentiated and both are handled as a write access. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Kirill A. Shutemov --- v24: - Change VM_SHSTK to VM_SHADOW_STACK. arch/x86/include/asm/trap_pf.h | 2 ++ arch/x86/mm/fault.c | 19 +++++++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/arch/x86/include/asm/trap_pf.h b/arch/x86/include/asm/trap_pf.h index 10b1de500ab1..afa524325e55 100644 --- a/arch/x86/include/asm/trap_pf.h +++ b/arch/x86/include/asm/trap_pf.h @@ -11,6 +11,7 @@ * bit 3 == 1: use of reserved bit detected * bit 4 == 1: fault was an instruction fetch * bit 5 == 1: protection keys block access + * bit 6 == 1: shadow stack access fault * bit 15 == 1: SGX MMU page-fault */ enum x86_pf_error_code { @@ -20,6 +21,7 @@ enum x86_pf_error_code { X86_PF_RSVD = 1 << 3, X86_PF_INSTR = 1 << 4, X86_PF_PK = 1 << 5, + X86_PF_SHSTK = 1 << 6, X86_PF_SGX = 1 << 15, }; diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index a73347e2cdfc..394e504305b7 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1100,6 +1100,17 @@ access_error(unsigned long error_code, struct vm_area_struct *vma) (error_code & X86_PF_INSTR), foreign)) return 1; + /* + * Verify a shadow stack access is within a shadow stack VMA. + * It is always an error otherwise. Normal data access to a + * shadow stack area is checked in the case followed. + */ + if (error_code & X86_PF_SHSTK) { + if (!(vma->vm_flags & VM_SHADOW_STACK)) + return 1; + return 0; + } + if (error_code & X86_PF_WRITE) { /* write, present and write, not present: */ if (unlikely(!(vma->vm_flags & VM_WRITE))) @@ -1293,6 +1304,14 @@ void do_user_addr_fault(struct pt_regs *regs, perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + /* + * Clearing _PAGE_DIRTY is used to detect shadow stack access. + * This method cannot distinguish shadow stack read vs. write. + * For valid shadow stack accesses, set FAULT_FLAG_WRITE to effect + * copy-on-write. + */ + if (error_code & X86_PF_SHSTK) + flags |= FAULT_FLAG_WRITE; if (error_code & X86_PF_WRITE) flags |= FAULT_FLAG_WRITE; if (error_code & X86_PF_INSTR) From patchwork Tue Apr 27 20:43:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4FB71C43460 for ; Tue, 27 Apr 2021 20:44:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DFB3A60720 for ; Tue, 27 Apr 2021 20:44:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DFB3A60720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DC0B76B0082; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D240C6B0083; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B4DCE6B0085; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0245.hostedemail.com [216.40.44.245]) by kanga.kvack.org (Postfix) with ESMTP id 8F7526B0082 for ; Tue, 27 Apr 2021 16:44:17 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 544A6180AD81A for ; Tue, 27 Apr 2021 20:44:17 +0000 (UTC) X-FDA: 78079324554.25.53E1507 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf07.hostedemail.com (Postfix) with ESMTP id 26FBDA000187 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) IronPort-SDR: pAwwbI40f5g+enmNjCzHZYrSztGMtprOGQ36Z/Hwk3oa7UmFYWcNHkQ/Uhgt742sPYxvY+bvqk A75fK5qnfj3Q== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473679" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473679" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:16 -0700 IronPort-SDR: b+IqZB8TbrWPXFZwoFff+l6S2gaMlSjz2O1nskI0ZpK48AzDREpe4vmGWd+bAk+SZdf7mHXbyk yswdPssc7eBA== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623514" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:15 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 15/30] x86/mm: Update maybe_mkwrite() for shadow stack Date: Tue, 27 Apr 2021 13:43:00 -0700 Message-Id: <20210427204315.24153-16-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 26FBDA000187 X-Stat-Signature: bujetj8tusy1w9h1ijzf1gyuchsca4fq Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf07; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556255-687087 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When serving a page fault, maybe_mkwrite() makes a PTE writable if its vma has VM_WRITE. A shadow stack vma has VM_SHADOW_STACK. Its PTEs have _PAGE_DIRTY, but not _PAGE_WRITE. In fork(), _PAGE_DIRTY is cleared to cause copy-on-write, and in the page fault handler, _PAGE_DIRTY is restored and the shadow stack page is writable again. Introduce an x86 version of maybe_mkwrite(), which sets proper PTE bits according to VM flags. Apply the same changes to maybe_pmd_mkwrite(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v24: - Instead of doing arch_maybe_mkwrite(), overwrite maybe*_mkwrite() with x86 versions. - Change VM_SHSTK to VM_SHADOW_STACK. arch/x86/include/asm/pgtable.h | 6 ++++++ arch/x86/mm/pgtable.c | 20 ++++++++++++++++++++ include/linux/mm.h | 2 ++ mm/huge_memory.c | 2 ++ 4 files changed, 30 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 46d9394b884f..da5dea417663 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -308,6 +308,9 @@ static inline int pmd_trans_huge(pmd_t pmd) return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE; } +#define maybe_pmd_mkwrite maybe_pmd_mkwrite +extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma); + #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD static inline int pud_trans_huge(pud_t pud) { @@ -1686,6 +1689,9 @@ static inline bool arch_faults_on_old_pte(void) return false; } +#define maybe_mkwrite maybe_mkwrite +extern pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma); + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_PGTABLE_H */ diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index f6a9e2e36642..e778dbbef3d8 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -610,6 +610,26 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma, } #endif +pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) +{ + if (likely(vma->vm_flags & VM_WRITE)) + pte = pte_mkwrite(pte); + else if (likely(vma->vm_flags & VM_SHADOW_STACK)) + pte = pte_mkwrite_shstk(pte); + return pte; +} + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) +{ + if (likely(vma->vm_flags & VM_WRITE)) + pmd = pmd_mkwrite(pmd); + else if (likely(vma->vm_flags & VM_SHADOW_STACK)) + pmd = pmd_mkwrite_shstk(pmd); + return pmd; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + /** * reserve_top_address - reserves a hole in the top of kernel address space * @reserve - size of hole to reserve diff --git a/include/linux/mm.h b/include/linux/mm.h index 08282eb2f195..6ac9b3e9a865 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -993,12 +993,14 @@ void free_compound_page(struct page *page); * pte_mkwrite. But get_user_pages can cause write faults for mappings * that do not have writing enabled, when used by access_process_vm. */ +#ifndef maybe_mkwrite static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) { if (likely(vma->vm_flags & VM_WRITE)) pte = pte_mkwrite(pte); return pte; } +#endif vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page); void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index ae907a9c2050..8203bd6ae4bd 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -478,12 +478,14 @@ static int __init setup_transparent_hugepage(char *str) } __setup("transparent_hugepage=", setup_transparent_hugepage); +#ifndef maybe_pmd_mkwrite pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) { if (likely(vma->vm_flags & VM_WRITE)) pmd = pmd_mkwrite(pmd); return pmd; } +#endif #ifdef CONFIG_MEMCG static inline struct deferred_split *get_deferred_split_queue(struct page *page) From patchwork Tue Apr 27 20:43:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227255 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F212C43461 for ; Tue, 27 Apr 2021 20:44:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F207E613F3 for ; Tue, 27 Apr 2021 20:44:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F207E613F3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 757096B0083; Tue, 27 Apr 2021 16:44:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 708786B0085; Tue, 27 Apr 2021 16:44:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4BDB26B0087; Tue, 27 Apr 2021 16:44:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id 1F7F86B0083 for ; Tue, 27 Apr 2021 16:44:18 -0400 (EDT) Received: from smtpin35.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D00D8180AD81A for ; Tue, 27 Apr 2021 20:44:17 +0000 (UTC) X-FDA: 78079324554.35.208B2A2 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id A29FE2000247 for ; Tue, 27 Apr 2021 20:44:18 +0000 (UTC) IronPort-SDR: Beija8km84oX0A6h8P07XWDdy88mdDxCqp/IDO/34wrG0BVLfvUjqkDenNusmVb3qPk5rDkLiu kUy0kRGLypNQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473684" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473684" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:16 -0700 IronPort-SDR: r3TWbtMTBiUQNhzR3TWb82UOlWWTti3U9Z4774f4nj3VScF6wnlqZJmSzr2xnxXA8uZWc1jDtW DSl4Zm+9kPyg== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623518" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:16 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 16/30] mm: Fixup places that call pte_mkwrite() directly Date: Tue, 27 Apr 2021 13:43:01 -0700 Message-Id: <20210427204315.24153-17-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: A29FE2000247 X-Stat-Signature: gt4oqrx6s8cub659cxyit84c6rfp5e4e X-Rspamd-Server: rspam02 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556258-65902 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When serving a page fault, maybe_mkwrite() makes a PTE writable if it is in a writable vma. A shadow stack vma is writable, but its PTEs need _PAGE_DIRTY to be set to become writable. For this reason, maybe_mkwrite() has been updated. There are a few places that call pte_mkwrite() directly, but have the same result as from maybe_mkwrite(). These sites need to be updated for shadow stack as well. Thus, change them to maybe_mkwrite(): - do_anonymous_page() and migrate_vma_insert_page() check VM_WRITE directly and call pte_mkwrite(), which is the same as maybe_mkwrite(). Change them to maybe_mkwrite(). - In do_numa_page(), if the numa entry was writable, then pte_mkwrite() is called directly. Fix it by doing maybe_mkwrite(). Make the same changes to do_huge_pmd_numa_page(). - In change_pte_range(), pte_mkwrite() is called directly. Replace it with maybe_mkwrite(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v25: - Apply same changes to do_huge_pmd_numa_page() as to do_numa_page(). mm/huge_memory.c | 2 +- mm/memory.c | 5 ++--- mm/migrate.c | 3 +-- mm/mprotect.c | 2 +- 4 files changed, 5 insertions(+), 7 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 8203bd6ae4bd..044029ef45cd 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1553,7 +1553,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd) pmd = pmd_modify(pmd, vma->vm_page_prot); pmd = pmd_mkyoung(pmd); if (was_writable) - pmd = pmd_mkwrite(pmd); + pmd = maybe_pmd_mkwrite(pmd, vma); set_pmd_at(vma->vm_mm, haddr, vmf->pmd, pmd); update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); unlock_page(page); diff --git a/mm/memory.c b/mm/memory.c index 550405fc3b5e..0cb1028b1c02 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3561,8 +3561,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) __SetPageUptodate(page); entry = mk_pte(page, vma->vm_page_prot); - if (vma->vm_flags & VM_WRITE) - entry = pte_mkwrite(pte_mkdirty(entry)); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); @@ -4125,7 +4124,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) pte = pte_modify(old_pte, vma->vm_page_prot); pte = pte_mkyoung(pte); if (was_writable) - pte = pte_mkwrite(pte); + pte = maybe_mkwrite(pte, vma); ptep_modify_prot_commit(vma, vmf->address, vmf->pte, old_pte, pte); update_mmu_cache(vma, vmf->address, vmf->pte); diff --git a/mm/migrate.c b/mm/migrate.c index 62b81d5257aa..7251c88a3d64 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2976,8 +2976,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, } } else { entry = mk_pte(page, vma->vm_page_prot); - if (vma->vm_flags & VM_WRITE) - entry = pte_mkwrite(pte_mkdirty(entry)); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); } ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); diff --git a/mm/mprotect.c b/mm/mprotect.c index 94188df1ee55..c1ce78d688b6 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -135,7 +135,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, if (dirty_accountable && pte_dirty(ptent) && (pte_soft_dirty(ptent) || !(vma->vm_flags & VM_SOFTDIRTY))) { - ptent = pte_mkwrite(ptent); + ptent = maybe_mkwrite(ptent, vma); } ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent); pages++; From patchwork Tue Apr 27 20:43:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2D46C433ED for ; Tue, 27 Apr 2021 20:44:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4E450613FE for ; Tue, 27 Apr 2021 20:44:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4E450613FE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A14DE6B0085; Tue, 27 Apr 2021 16:44:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9A2736B0087; Tue, 27 Apr 2021 16:44:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6DFDB6B0088; Tue, 27 Apr 2021 16:44:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0236.hostedemail.com [216.40.44.236]) by kanga.kvack.org (Postfix) with ESMTP id 497ED6B0085 for ; Tue, 27 Apr 2021 16:44:20 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 05D0B181AEF00 for ; Tue, 27 Apr 2021 20:44:20 +0000 (UTC) X-FDA: 78079324680.22.6B83B82 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf21.hostedemail.com (Postfix) with ESMTP id 15772E00012B for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) IronPort-SDR: UuIn8mAry5C89OMP7Tehg4tCC0PfH0yFZ1rdt/IC8ITH8RCCohInH7tE/Lw25vZ4Kux6W6ZpYh sBi7kVLEtEeA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473691" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473691" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:18 -0700 IronPort-SDR: X4XEh26uRW5SCsYJRhadYcziflBR+F11pNRCZDRYRlgwXTEVWFoSxTpPvuUGbuSOsrdopsYXxB YvOdSAUtODww== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623524" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:16 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 17/30] mm: Add guard pages around a shadow stack. Date: Tue, 27 Apr 2021 13:43:02 -0700 Message-Id: <20210427204315.24153-18-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: ia6ktm6o4sdg98g6ez9us4jfioheind1 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 15772E00012B Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556255-871200 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: INCSSP(Q/D) increments shadow stack pointer and 'pops and discards' the first and the last elements in the range, effectively touches those memory areas. The maximum moving distance by INCSSPQ is 255 * 8 = 2040 bytes and 255 * 4 = 1020 bytes by INCSSPD. Both ranges are far from PAGE_SIZE. Thus, putting a gap page on both ends of a shadow stack prevents INCSSP, CALL, and RET from going beyond. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v25: - Move SHADOW_STACK_GUARD_GAP to arch/x86/mm/mmap.c. v24: - Instead changing vm_*_gap(), create x86-specific versions. arch/x86/include/asm/page_types.h | 7 +++++ arch/x86/mm/mmap.c | 46 +++++++++++++++++++++++++++++++ include/linux/mm.h | 4 +++ 3 files changed, 57 insertions(+) diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h index a506a411474d..e1533fdc08b4 100644 --- a/arch/x86/include/asm/page_types.h +++ b/arch/x86/include/asm/page_types.h @@ -73,6 +73,13 @@ bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn); extern void initmem_init(void); +#define vm_start_gap vm_start_gap +struct vm_area_struct; +extern unsigned long vm_start_gap(struct vm_area_struct *vma); + +#define vm_end_gap vm_end_gap +extern unsigned long vm_end_gap(struct vm_area_struct *vma); + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_PAGE_DEFS_H */ diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index f3f52c5e2fd6..81f9325084d3 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -250,3 +250,49 @@ bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot) return false; return true; } + +/* + * Shadow stack pointer is moved by CALL, RET, and INCSSP(Q/D). INCSSPQ + * moves shadow stack pointer up to 255 * 8 = ~2 KB (~1KB for INCSSPD) and + * touches the first and the last element in the range, which triggers a + * page fault if the range is not in a shadow stack. Because of this, + * creating 4-KB guard pages around a shadow stack prevents these + * instructions from going beyond. + */ +#define SHADOW_STACK_GUARD_GAP PAGE_SIZE + +unsigned long vm_start_gap(struct vm_area_struct *vma) +{ + unsigned long vm_start = vma->vm_start; + unsigned long gap = 0; + + if (vma->vm_flags & VM_GROWSDOWN) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHADOW_STACK) + gap = SHADOW_STACK_GUARD_GAP; + + if (gap != 0) { + vm_start -= gap; + if (vm_start > vma->vm_start) + vm_start = 0; + } + return vm_start; +} + +unsigned long vm_end_gap(struct vm_area_struct *vma) +{ + unsigned long vm_end = vma->vm_end; + unsigned long gap = 0; + + if (vma->vm_flags & VM_GROWSUP) + gap = stack_guard_gap; + else if (vma->vm_flags & VM_SHADOW_STACK) + gap = SHADOW_STACK_GUARD_GAP; + + if (gap != 0) { + vm_end += gap; + if (vm_end < vma->vm_end) + vm_end = -PAGE_SIZE; + } + return vm_end; +} diff --git a/include/linux/mm.h b/include/linux/mm.h index 6ac9b3e9a865..3e9c84f21ef6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2660,6 +2660,7 @@ static inline struct vm_area_struct * find_vma_intersection(struct mm_struct * m return vma; } +#ifndef vm_start_gap static inline unsigned long vm_start_gap(struct vm_area_struct *vma) { unsigned long vm_start = vma->vm_start; @@ -2671,7 +2672,9 @@ static inline unsigned long vm_start_gap(struct vm_area_struct *vma) } return vm_start; } +#endif +#ifndef vm_end_gap static inline unsigned long vm_end_gap(struct vm_area_struct *vma) { unsigned long vm_end = vma->vm_end; @@ -2683,6 +2686,7 @@ static inline unsigned long vm_end_gap(struct vm_area_struct *vma) } return vm_end; } +#endif static inline unsigned long vma_pages(struct vm_area_struct *vma) { From patchwork Tue Apr 27 20:43:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFF54C43470 for ; Tue, 27 Apr 2021 20:44:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 966E5613E7 for ; Tue, 27 Apr 2021 20:44:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 966E5613E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 708F66B0087; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6928F6B0088; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 472CA6B0089; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0046.hostedemail.com [216.40.44.46]) by kanga.kvack.org (Postfix) with ESMTP id 150566B0087 for ; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D1705180AD81A for ; Tue, 27 Apr 2021 20:44:20 +0000 (UTC) X-FDA: 78079324680.21.2C5CE19 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf21.hostedemail.com (Postfix) with ESMTP id D0514E00011A for ; Tue, 27 Apr 2021 20:44:16 +0000 (UTC) IronPort-SDR: 7toGLBWEjNAqbtLMD3/MQGX46WK43/MFPlgwcBxBbqnXhLeoNThif4xTzKxm/ZjQhM3z8pgfvB 9rT+J/sbWOVQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473694" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473694" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:18 -0700 IronPort-SDR: dM0R5wTMXYwpzyKBQurBFk62wo0FSpBQUK57Ki20/BKmhEfoS5LHP+hQ3m4G9A3MrpvFZEUc0F XyMeDTJkqonQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623528" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:18 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 18/30] mm/mmap: Add shadow stack pages to memory accounting Date: Tue, 27 Apr 2021 13:43:03 -0700 Message-Id: <20210427204315.24153-19-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: 6ot3uhkyt1e99pb1hnzbnumhjf4prkjw X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: D0514E00011A Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556256-280267 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Account shadow stack pages to stack memory. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v26: - Remove redundant #ifdef CONFIG_MMU. v25: - Remove #ifdef CONFIG_ARCH_HAS_SHADOW_STACK for is_shadow_stack_mapping(). v24: - Change arch_shadow_stack_mapping() to is_shadow_stack_mapping(). - Change VM_SHSTK to VM_SHADOW_STACK. arch/x86/include/asm/pgtable.h | 3 +++ arch/x86/mm/pgtable.c | 5 +++++ include/linux/pgtable.h | 7 +++++++ mm/mmap.c | 5 +++++ 4 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index da5dea417663..7f324edaedfa 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1692,6 +1692,9 @@ static inline bool arch_faults_on_old_pte(void) #define maybe_mkwrite maybe_mkwrite extern pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma); +#define is_shadow_stack_mapping is_shadow_stack_mapping +extern bool is_shadow_stack_mapping(vm_flags_t vm_flags); + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_PGTABLE_H */ diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index e778dbbef3d8..b6ce620922e0 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -897,3 +897,8 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) #endif /* CONFIG_X86_64 */ #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */ + +bool is_shadow_stack_mapping(vm_flags_t vm_flags) +{ + return vm_flags & VM_SHADOW_STACK; +} diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 5e772392a379..d9af6f9aeef1 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1446,6 +1446,13 @@ static inline bool arch_has_pfn_modify_check(void) } #endif /* !_HAVE_ARCH_PFN_MODIFY_ALLOWED */ +#ifndef is_shadow_stack_mapping +static inline bool is_shadow_stack_mapping(vm_flags_t vm_flags) +{ + return false; +} +#endif + /* * Architecture PAGE_KERNEL_* fallbacks * diff --git a/mm/mmap.c b/mm/mmap.c index 3f287599a7a3..d77fb39b6ab5 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1718,6 +1718,9 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags) if (file && is_file_hugepages(file)) return 0; + if (is_shadow_stack_mapping(vm_flags)) + return 1; + return (vm_flags & (VM_NORESERVE | VM_SHARED | VM_WRITE)) == VM_WRITE; } @@ -3387,6 +3390,8 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages) mm->stack_vm += npages; else if (is_data_mapping(flags)) mm->data_vm += npages; + else if (is_shadow_stack_mapping(flags)) + mm->stack_vm += npages; } static vm_fault_t special_mapping_fault(struct vm_fault *vmf); From patchwork Tue Apr 27 20:43:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6692C43461 for ; Tue, 27 Apr 2021 20:44:50 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D142613BC for ; Tue, 27 Apr 2021 20:44:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8D142613BC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D73956B0088; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CD2C86B0089; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B29716B008A; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0093.hostedemail.com [216.40.44.93]) by kanga.kvack.org (Postfix) with ESMTP id 79BC46B0089 for ; Tue, 27 Apr 2021 16:44:21 -0400 (EDT) Received: from smtpin32.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 39F6C181AEF00 for ; Tue, 27 Apr 2021 20:44:21 +0000 (UTC) X-FDA: 78079324722.32.763D4EC Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf25.hostedemail.com (Postfix) with ESMTP id F09DC6000112 for ; Tue, 27 Apr 2021 20:44:15 +0000 (UTC) IronPort-SDR: eI3cjaJX/OC60JGa7rlA88Q+xHci4ujtF9U44Ih54veMb93SgxPEPUSpiONTf9WqVLlGSrmqbI xJb/NGTRr78g== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473697" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473697" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:19 -0700 IronPort-SDR: Y3eIhPtaBE1OHDJhM0C9YozO+hBI0cR0zf+ufyIl8shy1LMKbOcEAYlkNovrvQtP73wiNQ9ubD daQLBw43M/mg== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623532" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:18 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 19/30] mm: Update can_follow_write_pte() for shadow stack Date: Tue, 27 Apr 2021 13:43:04 -0700 Message-Id: <20210427204315.24153-20-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: F09DC6000112 X-Stat-Signature: 3id5qfm6ni5h1xap5u6kpbmdfbcr9xac Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556255-122090 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Can_follow_write_pte() ensures a read-only page is COWed by checking the FOLL_COW flag, and uses pte_dirty() to validate the flag is still valid. Like a writable data page, a shadow stack page is writable, and becomes read-only during copy-on-write, but it is always dirty. Thus, in the can_follow_write_pte() check, it belongs to the writable page case and should be excluded from the read-only page pte_dirty() check. Apply the same changes to can_follow_write_pmd(). While at it, also split the long line into smaller ones. Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov Cc: Kees Cook --- v26: - Instead of passing vm_flags, pass down vma pointer to can_follow_write_*(). v25: - Split long line into smaller ones. v24: - Change arch_shadow_stack_mapping() to is_shadow_stack_mapping(). mm/gup.c | 16 ++++++++++++---- mm/huge_memory.c | 16 ++++++++++++---- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index ef7d2da9f03f..f9705281e853 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -356,10 +356,18 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address, * FOLL_FORCE can write to even unwritable pte's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pte(pte_t pte, unsigned int flags) +static inline bool can_follow_write_pte(pte_t pte, unsigned int flags, + struct vm_area_struct *vma) { - return pte_write(pte) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte)); + if (pte_write(pte)) + return true; + if ((flags & (FOLL_FORCE | FOLL_COW)) != (FOLL_FORCE | FOLL_COW)) + return false; + if (!pte_dirty(pte)) + return false; + if (is_shadow_stack_mapping(vma->vm_flags)) + return false; + return true; } static struct page *follow_page_pte(struct vm_area_struct *vma, @@ -402,7 +410,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, } if ((flags & FOLL_NUMA) && pte_protnone(pte)) goto no_page; - if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) { + if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags, vma)) { pte_unmap_unlock(ptep, ptl); return NULL; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 044029ef45cd..cf10c3822853 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1338,10 +1338,18 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) * FOLL_FORCE can write to even unwritable pmd's, but only * after we've gone through a COW cycle and they are dirty. */ -static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags) +static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags, + struct vm_area_struct *vma) { - return pmd_write(pmd) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd)); + if (pmd_write(pmd)) + return true; + if ((flags & (FOLL_FORCE | FOLL_COW)) != (FOLL_FORCE | FOLL_COW)) + return false; + if (!pmd_dirty(pmd)) + return false; + if (is_shadow_stack_mapping(vma->vm_flags)) + return false; + return true; } struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, @@ -1354,7 +1362,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, assert_spin_locked(pmd_lockptr(mm, pmd)); - if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags)) + if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags, vma)) goto out; /* Avoid dumping huge zero page */ From patchwork Tue Apr 27 20:43:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B8EDC433ED for ; Tue, 27 Apr 2021 20:44:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AB48F60720 for ; Tue, 27 Apr 2021 20:44:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AB48F60720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5F6D56B0089; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 582146B008A; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 35CF16B008C; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0122.hostedemail.com [216.40.44.122]) by kanga.kvack.org (Postfix) with ESMTP id 09F656B0089 for ; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id ABF1152AE for ; Tue, 27 Apr 2021 20:44:21 +0000 (UTC) X-FDA: 78079324722.11.F54E916 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf21.hostedemail.com (Postfix) with ESMTP id AD409E00011A for ; Tue, 27 Apr 2021 20:44:17 +0000 (UTC) IronPort-SDR: q/GRQedJZPkOGjo7kN/LCtUiZBTzndC8OOU49ypN/8sTpOJOWE2nkormE0rX8gWMv3XV/4NM32 5iwp2pL8BnjA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473701" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473701" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:20 -0700 IronPort-SDR: Joo7589v8RQFTBvQzvuNM/X3aTnf2/jdVcBYeVk2Ys7N8y3DoL6PpG9QEoSq8kJHk9R3o1lsKp hnpuSNTa3X/w== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623535" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:19 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 20/30] mm/mprotect: Exclude shadow stack from preserve_write Date: Tue, 27 Apr 2021 13:43:05 -0700 Message-Id: <20210427204315.24153-21-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: ue3yrc6gif75sosqtpzwtord5iu9x16j X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: AD409E00011A Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556257-311618 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is preserved to avoid the additional write fault after the NUMA hinting fault. However, pte_write() now includes both normal writable and shadow stack (RW=0, Dirty=1) PTEs, but the latter does not have _PAGE_RW and has no need to preserve it. Exclude shadow stack from preserve_write test, and apply the same change to change_huge_pmd(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kirill A. Shutemov --- v25: - Move is_shadow_stack_mapping() to a separate line. v24: - Change arch_shadow_stack_mapping() to is_shadow_stack_mapping(). mm/huge_memory.c | 7 +++++++ mm/mprotect.c | 7 +++++++ 2 files changed, 14 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index cf10c3822853..3b841d7efd13 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1824,6 +1824,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, return 0; preserve_write = prot_numa && pmd_write(*pmd); + + /* + * Preserve only normal writable huge PMD, but not shadow + * stack (RW=0, Dirty=1). + */ + if (is_shadow_stack_mapping(vma->vm_flags)) + preserve_write = false; ret = 1; #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION diff --git a/mm/mprotect.c b/mm/mprotect.c index c1ce78d688b6..3b2f0d75519f 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -77,6 +77,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, pte_t ptent; bool preserve_write = prot_numa && pte_write(oldpte); + /* + * Preserve only normal writable PTE, but not shadow + * stack (RW=0, Dirty=1). + */ + if (is_shadow_stack_mapping(vma->vm_flags)) + preserve_write = false; + /* * Avoid trapping faults against the zero or KSM * pages. See similar comment in change_huge_pmd. From patchwork Tue Apr 27 20:43:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227265 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DF67C43611 for ; Tue, 27 Apr 2021 20:44:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BFBDA60720 for ; Tue, 27 Apr 2021 20:44:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BFBDA60720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D708A6B008C; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D23206B0093; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AD8B96B0092; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0064.hostedemail.com [216.40.44.64]) by kanga.kvack.org (Postfix) with ESMTP id 634F46B008C for ; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 20DB7246E for ; Tue, 27 Apr 2021 20:44:22 +0000 (UTC) X-FDA: 78079324764.23.B75C961 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf25.hostedemail.com (Postfix) with ESMTP id B4BE0600010F for ; Tue, 27 Apr 2021 20:44:16 +0000 (UTC) IronPort-SDR: 1FfMRy7k5keSvgYSW+Dhqml3Be3Q6U8FM0b3TPytzWb/9OGSCJMdeQUw5yhVuo1lYUOer8sQU8 rNqfUBJXa5cQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473705" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473705" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:20 -0700 IronPort-SDR: HR7ve9IwBTwRZRfRoPA9h33DhjwH/KZ4VjrmQmDONXSVEgEbzojR50sPTdPfUIWAlO2EDAH91t ldt1kaKqujRQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623540" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:20 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , Peter Collingbourne , "Kirill A . Shutemov" , Andrew Morton Subject: [PATCH v26 21/30] mm: Re-introduce vm_flags to do_mmap() Date: Tue, 27 Apr 2021 13:43:06 -0700 Message-Id: <20210427204315.24153-22-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: B4BE0600010F X-Stat-Signature: moyeg4t7fmhtyqu5qwjhqb7supj3woj7 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556256-814712 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There was no more caller passing vm_flags to do_mmap(), and vm_flags was removed from the function's input by: commit 45e55300f114 ("mm: remove unnecessary wrapper function do_mmap_pgoff()"). There is a new user now. Shadow stack allocation passes VM_SHADOW_STACK to do_mmap(). Thus, re-introduce vm_flags to do_mmap(). Signed-off-by: Yu-cheng Yu Reviewed-by: Peter Collingbourne Reviewed-by: Kees Cook Reviewed-by: Kirill A. Shutemov Cc: Andrew Morton Cc: Oleg Nesterov Cc: linux-mm@kvack.org --- v24: - Change VM_SHSTK to VM_SHADOW_STACK. - Update commit log. fs/aio.c | 2 +- include/linux/mm.h | 3 ++- ipc/shm.c | 2 +- mm/mmap.c | 10 +++++----- mm/nommu.c | 4 ++-- mm/util.c | 2 +- 6 files changed, 12 insertions(+), 11 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 1f32da13d39e..b5d0586209a7 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -529,7 +529,7 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) ctx->mmap_base = do_mmap(ctx->aio_ring_file, 0, ctx->mmap_size, PROT_READ | PROT_WRITE, - MAP_SHARED, 0, &unused, NULL); + MAP_SHARED, 0, 0, &unused, NULL); mmap_write_unlock(mm); if (IS_ERR((void *)ctx->mmap_base)) { ctx->mmap_size = 0; diff --git a/include/linux/mm.h b/include/linux/mm.h index 3e9c84f21ef6..1ccec5cc399b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2576,7 +2576,8 @@ extern unsigned long mmap_region(struct file *file, unsigned long addr, struct list_head *uf); extern unsigned long do_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, - unsigned long pgoff, unsigned long *populate, struct list_head *uf); + vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, + struct list_head *uf); extern int __do_munmap(struct mm_struct *, unsigned long, size_t, struct list_head *uf, bool downgrade); extern int do_munmap(struct mm_struct *, unsigned long, size_t, diff --git a/ipc/shm.c b/ipc/shm.c index febd88daba8c..b6370eb1eaab 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -1556,7 +1556,7 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg, goto invalid; } - addr = do_mmap(file, addr, size, prot, flags, 0, &populate, NULL); + addr = do_mmap(file, addr, size, prot, flags, 0, 0, &populate, NULL); *raddr = addr; err = 0; if (IS_ERR_VALUE(addr)) diff --git a/mm/mmap.c b/mm/mmap.c index d77fb39b6ab5..7b2992ef8ee0 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1401,11 +1401,11 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode, */ unsigned long do_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, - unsigned long flags, unsigned long pgoff, - unsigned long *populate, struct list_head *uf) + unsigned long flags, vm_flags_t vm_flags, + unsigned long pgoff, unsigned long *populate, + struct list_head *uf) { struct mm_struct *mm = current->mm; - vm_flags_t vm_flags; int pkey = 0; *populate = 0; @@ -1467,7 +1467,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, * to. we assume access permissions have been handled by the open * of the memory object, so we don't do any here. */ - vm_flags = calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | + vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; if (flags & MAP_LOCKED) @@ -3047,7 +3047,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, file = get_file(vma->vm_file); ret = do_mmap(vma->vm_file, start, size, - prot, flags, pgoff, &populate, NULL); + prot, flags, 0, pgoff, &populate, NULL); fput(file); out: mmap_write_unlock(mm); diff --git a/mm/nommu.c b/mm/nommu.c index 5c9ab799c0e6..9b6f7a1895c2 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1071,6 +1071,7 @@ unsigned long do_mmap(struct file *file, unsigned long len, unsigned long prot, unsigned long flags, + vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate, struct list_head *uf) @@ -1078,7 +1079,6 @@ unsigned long do_mmap(struct file *file, struct vm_area_struct *vma; struct vm_region *region; struct rb_node *rb; - vm_flags_t vm_flags; unsigned long capabilities, result; int ret; @@ -1097,7 +1097,7 @@ unsigned long do_mmap(struct file *file, /* we've determined that we can make the mapping, now translate what we * now know into VMA flags */ - vm_flags = determine_vm_flags(file, prot, flags, capabilities); + vm_flags |= determine_vm_flags(file, prot, flags, capabilities); /* we're going to need to record the mapping */ region = kmem_cache_zalloc(vm_region_jar, GFP_KERNEL); diff --git a/mm/util.c b/mm/util.c index 54870226cea6..49cbd4400d13 100644 --- a/mm/util.c +++ b/mm/util.c @@ -516,7 +516,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, if (!ret) { if (mmap_write_lock_killable(mm)) return -EINTR; - ret = do_mmap(file, addr, len, prot, flag, pgoff, &populate, + ret = do_mmap(file, addr, len, prot, flag, 0, pgoff, &populate, &uf); mmap_write_unlock(mm); userfaultfd_unmap_complete(mm, &uf); From patchwork Tue Apr 27 20:43:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227267 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BB56C43461 for ; Tue, 27 Apr 2021 20:44:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BF5B9613E5 for ; Tue, 27 Apr 2021 20:44:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BF5B9613E5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 58AAE6B008A; Tue, 27 Apr 2021 16:44:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 499566B0092; Tue, 27 Apr 2021 16:44:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 29BBB6B0093; Tue, 27 Apr 2021 16:44:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0164.hostedemail.com [216.40.44.164]) by kanga.kvack.org (Postfix) with ESMTP id CCE5F6B008A for ; Tue, 27 Apr 2021 16:44:22 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 8F5558249980 for ; Tue, 27 Apr 2021 20:44:22 +0000 (UTC) X-FDA: 78079324764.13.83FFD53 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf21.hostedemail.com (Postfix) with ESMTP id 89CD7E00011A for ; Tue, 27 Apr 2021 20:44:18 +0000 (UTC) IronPort-SDR: QyEK6LjcYxML3UiAdHCCi8qo3+sV9+3QafHF+ovS9hyKjTJR1o2QRVXf7IG1ehfhw9OwfjHQsg 1zo89eLVid/A== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473712" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473712" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:21 -0700 IronPort-SDR: N4djjsuTsb36X0V7S7jVHy8u4sXq9sOo9yzaroGzj+/EuG9F9TACm8VLdv+KQYpluiiml3bZ5i 3eTEeslK0DIQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623544" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:20 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 22/30] x86/cet/shstk: Add user-mode shadow stack support Date: Tue, 27 Apr 2021 13:43:07 -0700 Message-Id: <20210427204315.24153-23-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: 9n76bw1sncqqeoa9xsozt5jqcpy7ihe7 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 89CD7E00011A Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556258-508804 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Introduce basic shadow stack enabling/disabling/allocation routines. A task's shadow stack is allocated from memory with VM_SHADOW_STACK flag and has a fixed size of min(RLIMIT_STACK, 4GB). Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Change CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK. - Update alloc_shstk(), remove unused input flags. v24: - Rename cet.c to shstk.c, update related areas accordingly. arch/x86/include/asm/cet.h | 29 ++++++++ arch/x86/include/asm/processor.h | 5 ++ arch/x86/kernel/Makefile | 2 + arch/x86/kernel/shstk.c | 123 +++++++++++++++++++++++++++++++ 4 files changed, 159 insertions(+) create mode 100644 arch/x86/include/asm/cet.h create mode 100644 arch/x86/kernel/shstk.c diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h new file mode 100644 index 000000000000..aa85d599b184 --- /dev/null +++ b/arch/x86/include/asm/cet.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_CET_H +#define _ASM_X86_CET_H + +#ifndef __ASSEMBLY__ +#include + +struct task_struct; +/* + * Per-thread CET status + */ +struct cet_status { + unsigned long shstk_base; + unsigned long shstk_size; +}; + +#ifdef CONFIG_X86_SHADOW_STACK +int shstk_setup(void); +void shstk_free(struct task_struct *p); +void shstk_disable(void); +#else +static inline int shstk_setup(void) { return 0; } +static inline void shstk_free(struct task_struct *p) {} +static inline void shstk_disable(void) {} +#endif + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_X86_CET_H */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index f1b9ed5efaa9..3690b78e55bb 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -27,6 +27,7 @@ struct vm86; #include #include #include +#include #include #include @@ -535,6 +536,10 @@ struct thread_struct { unsigned int sig_on_uaccess_err:1; +#ifdef CONFIG_X86_SHADOW_STACK + struct cet_status cet; +#endif + /* Floating point and extended processor state */ struct fpu fpu; /* diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 2ddf08351f0b..0f99b093f350 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -150,6 +150,8 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev-es.o +obj-$(CONFIG_X86_SHADOW_STACK) += shstk.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c new file mode 100644 index 000000000000..c815c7507830 --- /dev/null +++ b/arch/x86/kernel/shstk.c @@ -0,0 +1,123 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * shstk.c - Intel shadow stack support + * + * Copyright (c) 2021, Intel Corporation. + * Yu-cheng Yu + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static void start_update_msrs(void) +{ + fpregs_lock(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + __fpregs_load_activate(); +} + +static void end_update_msrs(void) +{ + fpregs_unlock(); +} + +static unsigned long alloc_shstk(unsigned long size) +{ + struct mm_struct *mm = current->mm; + unsigned long addr, populate; + int flags = MAP_ANONYMOUS | MAP_PRIVATE; + + mmap_write_lock(mm); + addr = do_mmap(NULL, 0, size, PROT_READ, flags, VM_SHADOW_STACK, 0, + &populate, NULL); + mmap_write_unlock(mm); + + return addr; +} + +int shstk_setup(void) +{ + unsigned long addr, size; + struct cet_status *cet = ¤t->thread.cet; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return -EOPNOTSUPP; + + size = round_up(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G), PAGE_SIZE); + addr = alloc_shstk(size); + if (IS_ERR_VALUE(addr)) + return PTR_ERR((void *)addr); + + cet->shstk_base = addr; + cet->shstk_size = size; + + start_update_msrs(); + wrmsrl(MSR_IA32_PL3_SSP, addr + size); + wrmsrl(MSR_IA32_U_CET, CET_SHSTK_EN); + end_update_msrs(); + return 0; +} + +void shstk_free(struct task_struct *tsk) +{ + struct cet_status *cet = &tsk->thread.cet; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK) || + !cet->shstk_size || + !cet->shstk_base) + return; + + if (!tsk->mm) + return; + + while (1) { + int r; + + r = vm_munmap(cet->shstk_base, cet->shstk_size); + + /* + * vm_munmap() returns -EINTR when mmap_lock is held by + * something else, and that lock should not be held for a + * long time. Retry it for the case. + */ + if (r == -EINTR) { + cond_resched(); + continue; + } + break; + } + + cet->shstk_base = 0; + cet->shstk_size = 0; +} + +void shstk_disable(void) +{ + struct cet_status *cet = ¤t->thread.cet; + u64 msr_val; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK) || + !cet->shstk_size || + !cet->shstk_base) + return; + + start_update_msrs(); + rdmsrl(MSR_IA32_U_CET, msr_val); + wrmsrl(MSR_IA32_U_CET, msr_val & ~CET_SHSTK_EN); + wrmsrl(MSR_IA32_PL3_SSP, 0); + end_update_msrs(); + + shstk_free(current); +} From patchwork Tue Apr 27 20:43:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227269 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 330D4C43616 for ; Tue, 27 Apr 2021 20:44:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CF1A0613F3 for ; Tue, 27 Apr 2021 20:44:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CF1A0613F3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1B0B56B0092; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 14E5E6B0093; Tue, 27 Apr 2021 16:44:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BE9156B0095; Tue, 27 Apr 2021 16:44:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0213.hostedemail.com [216.40.44.213]) by kanga.kvack.org (Postfix) with ESMTP id 993906B0092 for ; Tue, 27 Apr 2021 16:44:24 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5D6C94417 for ; Tue, 27 Apr 2021 20:44:24 +0000 (UTC) X-FDA: 78079324848.20.034D524 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf27.hostedemail.com (Postfix) with ESMTP id 0A68F80192EA for ; Tue, 27 Apr 2021 20:44:01 +0000 (UTC) IronPort-SDR: EQBRqh6lgD9QMVege4ev52LKCTTxQwVJbANJQOJBDEBTQFlBrkD/4BorMnh4RXGioe5vM6vx4X ZC/Dq21iS3Xw== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473717" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473717" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:22 -0700 IronPort-SDR: 75VLjNgq4HBl3JHUeKIoj71GJcm5Gbcd3TtQYTMTQH/p9nAv02K9RPUsZm2A3RWacsigMkHh5P jeDPO8y3iV0g== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623548" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:21 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 23/30] x86/cet/shstk: Handle thread shadow stack Date: Tue, 27 Apr 2021 13:43:08 -0700 Message-Id: <20210427204315.24153-24-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 0A68F80192EA X-Stat-Signature: jutsxzhb875o7cndti8hxna6iywk8ud5 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf27; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556241-667163 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: For clone() with CLONE_VM specified, the child and the parent must have separate shadow stacks. Thus, the kernel allocates, and frees on thread exit a new shadow stack for the child. Use stack_size passed from clone3() syscall for thread shadow stack size, but cap it to min(RLIMIT_STACK, 4 GB). A compat-mode thread shadow stack size is further reduced to 1/4. This allows more threads to run in a 32- bit address space. Signed-off-by: Yu-cheng Yu --- arch/x86/include/asm/cet.h | 5 +++ arch/x86/include/asm/mmu_context.h | 3 ++ arch/x86/kernel/process.c | 15 ++++++-- arch/x86/kernel/shstk.c | 57 +++++++++++++++++++++++++++++- 4 files changed, 76 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index aa85d599b184..8b83ded577cc 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -16,10 +16,15 @@ struct cet_status { #ifdef CONFIG_X86_SHADOW_STACK int shstk_setup(void); +int shstk_setup_thread(struct task_struct *p, unsigned long clone_flags, + unsigned long stack_size); void shstk_free(struct task_struct *p); void shstk_disable(void); #else static inline int shstk_setup(void) { return 0; } +static inline int shstk_setup_thread(struct task_struct *p, + unsigned long clone_flags, + unsigned long stack_size) { return 0; } static inline void shstk_free(struct task_struct *p) {} static inline void shstk_disable(void) {} #endif diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 27516046117a..53569114aa01 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -11,6 +11,7 @@ #include #include +#include #include extern atomic64_t last_mm_ctx_id; @@ -146,6 +147,8 @@ do { \ #else #define deactivate_mm(tsk, mm) \ do { \ + if (!tsk->vfork_done) \ + shstk_free(tsk); \ load_gs_index(0); \ loadsegment(fs, 0); \ } while (0) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 9c214d7085a4..fa01e8679d01 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -43,6 +43,7 @@ #include #include #include +#include #include "process.h" @@ -109,6 +110,7 @@ void exit_thread(struct task_struct *tsk) free_vm86(t); + shstk_free(tsk); fpu__drop(fpu); } @@ -122,8 +124,9 @@ static int set_new_tls(struct task_struct *p, unsigned long tls) return do_set_thread_area_64(p, ARCH_SET_FS, tls); } -int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg, - struct task_struct *p, unsigned long tls) +int copy_thread(unsigned long clone_flags, unsigned long sp, + unsigned long stack_size, struct task_struct *p, + unsigned long tls) { struct inactive_task_frame *frame; struct fork_frame *fork_frame; @@ -163,7 +166,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg, /* Kernel thread ? */ if (unlikely(p->flags & (PF_KTHREAD | PF_IO_WORKER))) { memset(childregs, 0, sizeof(struct pt_regs)); - kthread_frame_init(frame, sp, arg); + kthread_frame_init(frame, sp, stack_size); return 0; } @@ -181,6 +184,12 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg, if (clone_flags & CLONE_SETTLS) ret = set_new_tls(p, tls); +#ifdef CONFIG_X86_64 + /* Allocate a new shadow stack for pthread */ + if (!ret) + ret = shstk_setup_thread(p, clone_flags, stack_size); +#endif + if (!ret && unlikely(test_tsk_thread_flag(current, TIF_IO_BITMAP))) io_bitmap_share(p); diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index c815c7507830..d387df84b7f1 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -70,6 +70,55 @@ int shstk_setup(void) return 0; } +int shstk_setup_thread(struct task_struct *tsk, unsigned long clone_flags, + unsigned long stack_size) +{ + unsigned long addr, size; + struct cet_user_state *state; + struct cet_status *cet = &tsk->thread.cet; + + if (!cet->shstk_size) + return 0; + + if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM) + return 0; + + state = get_xsave_addr(&tsk->thread.fpu.state.xsave, + XFEATURE_CET_USER); + + if (!state) + return -EINVAL; + + if (stack_size == 0) + return -EINVAL; + + /* Cap shadow stack size to 4 GB */ + size = min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G); + size = min(size, stack_size); + + /* + * Compat-mode pthreads share a limited address space. + * If each function call takes an average of four slots + * stack space, allocate 1/4 of stack size for shadow stack. + */ + if (in_compat_syscall()) + size /= 4; + size = round_up(size, PAGE_SIZE); + addr = alloc_shstk(size); + + if (IS_ERR_VALUE(addr)) { + cet->shstk_base = 0; + cet->shstk_size = 0; + return PTR_ERR((void *)addr); + } + + fpu__prepare_write(&tsk->thread.fpu); + state->user_ssp = (u64)(addr + size); + cet->shstk_base = addr; + cet->shstk_size = size; + return 0; +} + void shstk_free(struct task_struct *tsk) { struct cet_status *cet = &tsk->thread.cet; @@ -79,7 +128,13 @@ void shstk_free(struct task_struct *tsk) !cet->shstk_base) return; - if (!tsk->mm) + /* + * When fork() with CLONE_VM fails, the child (tsk) already has a + * shadow stack allocated, and exit_thread() calls this function to + * free it. In this case the parent (current) and the child is + * sharing the same mm struct. + */ + if (!tsk->mm || tsk->mm != current->mm) return; while (1) { From patchwork Tue Apr 27 20:43:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227271 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72B77C43460 for ; Tue, 27 Apr 2021 20:45:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F0D50613E7 for ; Tue, 27 Apr 2021 20:45:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F0D50613E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A40D06B0093; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 99EC36B0095; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A2EA6B0096; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0163.hostedemail.com [216.40.44.163]) by kanga.kvack.org (Postfix) with ESMTP id 52F506B0093 for ; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0ED42180AD81A for ; Tue, 27 Apr 2021 20:44:25 +0000 (UTC) X-FDA: 78079324890.01.015C011 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf27.hostedemail.com (Postfix) with ESMTP id A543480192D4 for ; Tue, 27 Apr 2021 20:44:02 +0000 (UTC) IronPort-SDR: Gf1PZJJsJsg3UlIn703BnnlJR2UN1xtYTEVC2mdaPl8V8eaVxUVpfwuPs3ir+mpvrgI0LQoqg2 HpaK6fPEvQLg== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194473720" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="194473720" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:22 -0700 IronPort-SDR: qTt0V0V7d3JR1JYWKWQfR06T/CyMY+mWsWEwYivGU0EXnElIqhBL5NlcCZ9//aCZgjfJrtNRLU nvYZ6siDBSGQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623551" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:22 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 24/30] x86/cet/shstk: Introduce shadow stack token setup/verify routines Date: Tue, 27 Apr 2021 13:43:09 -0700 Message-Id: <20210427204315.24153-25-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: A543480192D4 X-Stat-Signature: pugkbk8onotjeizmd8tgz7qetqd4qkou Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf27; identity=mailfrom; envelope-from=""; helo=mga04.intel.com; client-ip=192.55.52.120 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556242-225633 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A shadow stack restore token marks a restore point of the shadow stack, and the address in a token must point directly above the token, which is within the same shadow stack. This is distinctively different from other pointers on the shadow stack, since those pointers point to executable code area. The restore token can be used as an extra protection for signal handling. To deliver a signal, create a shadow stack restore token and put the token and the signal restorer address on the shadow stack. In sigreturn, verify the token and restore from it the shadow stack pointer. Introduce token setup and verify routines. Also introduce WRUSS, which is a kernel-mode instruction but writes directly to user shadow stack. It is used to construct user signal stack as described above. Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Update inline assembly syntax, use %[]. - Change token address from (unsigned long) to (u64/u32 __user *). - Change -EPERM to -EFAULT. arch/x86/include/asm/cet.h | 9 ++ arch/x86/include/asm/special_insns.h | 32 +++++++ arch/x86/kernel/shstk.c | 126 +++++++++++++++++++++++++++ 3 files changed, 167 insertions(+) diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index 8b83ded577cc..ef6155213b7e 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -20,6 +20,10 @@ int shstk_setup_thread(struct task_struct *p, unsigned long clone_flags, unsigned long stack_size); void shstk_free(struct task_struct *p); void shstk_disable(void); +int shstk_setup_rstor_token(bool ia32, unsigned long rstor, + unsigned long *token_addr, unsigned long *new_ssp); +int shstk_check_rstor_token(bool ia32, unsigned long token_addr, + unsigned long *new_ssp); #else static inline int shstk_setup(void) { return 0; } static inline int shstk_setup_thread(struct task_struct *p, @@ -27,6 +31,11 @@ static inline int shstk_setup_thread(struct task_struct *p, unsigned long stack_size) { return 0; } static inline void shstk_free(struct task_struct *p) {} static inline void shstk_disable(void) {} +static inline int shstk_setup_rstor_token(bool ia32, unsigned long rstor, + unsigned long *token_addr, + unsigned long *new_ssp) { return 0; } +static inline int shstk_check_rstor_token(bool ia32, unsigned long token_addr, + unsigned long *new_ssp) { return 0; } #endif #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 1d3cbaef4bb7..5a0488923cae 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -234,6 +234,38 @@ static inline void clwb(volatile void *__p) : [pax] "a" (p)); } +#ifdef CONFIG_X86_SHADOW_STACK +#if defined(CONFIG_IA32_EMULATION) || defined(CONFIG_X86_X32) +static inline int write_user_shstk_32(u32 __user *addr, u32 val) +{ + asm_volatile_goto("1: wrussd %[val], (%[addr])\n" + _ASM_EXTABLE(1b, %l[fail]) + :: [addr] "r" (addr), [val] "r" (val) + :: fail); + return 0; +fail: + return -EFAULT; +} +#else +static inline int write_user_shstk_32(u32 __user *addr, u32 val) +{ + WARN_ONCE(1, "%s used but not supported.\n", __func__); + return -EFAULT; +} +#endif + +static inline int write_user_shstk_64(u64 __user *addr, u64 val) +{ + asm_volatile_goto("1: wrussq %[val], (%[addr])\n" + _ASM_EXTABLE(1b, %l[fail]) + :: [addr] "r" (addr), [val] "r" (val) + :: fail); + return 0; +fail: + return -EFAULT; +} +#endif /* CONFIG_X86_SHADOW_STACK */ + #define nop() asm volatile ("nop") static inline void serialize(void) diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index d387df84b7f1..48a0c87414ef 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -20,6 +20,7 @@ #include #include #include +#include static void start_update_msrs(void) { @@ -176,3 +177,128 @@ void shstk_disable(void) shstk_free(current); } + +static unsigned long _get_user_shstk_addr(void) +{ + struct fpu *fpu = ¤t->thread.fpu; + unsigned long ssp = 0; + + fpregs_lock(); + + if (fpregs_state_valid(fpu, smp_processor_id())) { + rdmsrl(MSR_IA32_PL3_SSP, ssp); + } else { + struct cet_user_state *p; + + p = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER); + if (p) + ssp = p->user_ssp; + } + + fpregs_unlock(); + return ssp; +} + +#define TOKEN_MODE_MASK 3UL +#define TOKEN_MODE_64 1UL +#define IS_TOKEN_64(token) (((token) & TOKEN_MODE_MASK) == TOKEN_MODE_64) +#define IS_TOKEN_32(token) (((token) & TOKEN_MODE_MASK) == 0) + +/* + * Create a restore token on the shadow stack. A token is always 8-byte + * and aligned to 8. + */ +static int _create_rstor_token(bool ia32, unsigned long ssp, + unsigned long *token_addr) +{ + unsigned long addr; + + *token_addr = 0; + + if ((!ia32 && !IS_ALIGNED(ssp, 8)) || !IS_ALIGNED(ssp, 4)) + return -EINVAL; + + addr = ALIGN_DOWN(ssp, 8) - 8; + + /* Is the token for 64-bit? */ + if (!ia32) + ssp |= TOKEN_MODE_64; + + if (write_user_shstk_64((u64 __user *)addr, (u64)ssp)) + return -EFAULT; + + *token_addr = addr; + return 0; +} + +/* + * Create a restore token on shadow stack, and then push the user-mode + * function return address. + */ +int shstk_setup_rstor_token(bool ia32, unsigned long ret_addr, + unsigned long *token_addr, unsigned long *new_ssp) +{ + struct cet_status *cet = ¤t->thread.cet; + unsigned long ssp = 0; + int err = 0; + + if (cet->shstk_size) { + if (!ret_addr) + return -EINVAL; + + ssp = _get_user_shstk_addr(); + err = _create_rstor_token(ia32, ssp, token_addr); + if (err) + return err; + + if (ia32) { + *new_ssp = *token_addr - sizeof(u32); + err = write_user_shstk_32((u32 __user *)*new_ssp, (u32)ret_addr); + } else { + *new_ssp = *token_addr - sizeof(u64); + err = write_user_shstk_64((u64 __user *)*new_ssp, (u64)ret_addr); + } + } + + return err; +} + +/* + * Verify token_addr point to a valid token, and then set *new_ssp + * according to the token. + */ +int shstk_check_rstor_token(bool ia32, unsigned long token_addr, unsigned long *new_ssp) +{ + unsigned long token; + + *new_ssp = 0; + + if (!IS_ALIGNED(token_addr, 8)) + return -EINVAL; + + if (get_user(token, (unsigned long __user *)token_addr)) + return -EFAULT; + + /* Is 64-bit mode flag correct? */ + if (!ia32 && !IS_TOKEN_64(token)) + return -EINVAL; + else if (ia32 && !IS_TOKEN_32(token)) + return -EINVAL; + + token &= ~TOKEN_MODE_MASK; + + /* + * Restore address properly aligned? + */ + if ((!ia32 && !IS_ALIGNED(token, 8)) || !IS_ALIGNED(token, 4)) + return -EINVAL; + + /* + * Token was placed properly? + */ + if (((ALIGN_DOWN(token, 8) - 8) != token_addr) || token >= TASK_SIZE_MAX) + return -EINVAL; + + *new_ssp = token; + return 0; +} From patchwork Tue Apr 27 20:43:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227273 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 517B1C43470 for ; Tue, 27 Apr 2021 20:45:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E258360720 for ; Tue, 27 Apr 2021 20:45:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E258360720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4719B6B0095; Tue, 27 Apr 2021 16:44:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3FA536B0096; Tue, 27 Apr 2021 16:44:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1D8866B0098; Tue, 27 Apr 2021 16:44:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id E4CF76B0095 for ; Tue, 27 Apr 2021 16:44:25 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id A57278249980 for ; Tue, 27 Apr 2021 20:44:25 +0000 (UTC) X-FDA: 78079324890.02.98D40B5 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf09.hostedemail.com (Postfix) with ESMTP id 42E6C6000113 for ; Tue, 27 Apr 2021 20:44:18 +0000 (UTC) IronPort-SDR: vBIgsRpdHckTRqE/edWIecMl6QimoN0/JebSDP3Sje28r8+vxkGIxHXM6GKZbnr1ztOgNDSKMu QvgptRhJ+BhA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706101" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706101" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:23 -0700 IronPort-SDR: rM42Qnag36xHnuLTxjiNeven6Z3hacM22stlqYZPZujkhxvTmy46s3gdNx9/rehtNpA6ZNn9gf dVRnV5tUiKig== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623554" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:22 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 25/30] x86/cet/shstk: Handle signals for shadow stack Date: Tue, 27 Apr 2021 13:43:10 -0700 Message-Id: <20210427204315.24153-26-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: qnicozdjqp8f3y7rotwsmttqcqmw7rb3 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 42E6C6000113 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf09; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556258-902071 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When shadow stack is enabled, a task's shadow stack states must be saved along with the signal context and later restored in sigreturn. However, currently there is no systematic facility for extending a signal context. There is some space left in the ucontext, but changing ucontext is likely to create compatibility issues and there is not enough space for further extensions. Introduce a signal context extension struct 'sc_ext', which is used to save shadow stack restore token address. The extension is located above the fpu states, plus alignment. The struct can be extended (such as the ibt's wait_endbr status to be introduced later), and sc_ext.total_size field keeps track of total size. Introduce routines for the allocation, save, and restore for sc_ext: - fpu__alloc_sigcontext_ext(), - save_extra_state_to_sigframe(), - get_extra_state_from_sigframe(), - restore_extra_state_to_xregs(). Signed-off-by: Yu-cheng Yu Cc: Kees Cook --- v25: - Update commit log/comments for the sc_ext struct. - Use restorer address already calculated. - Change CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK. - Change X86_FEATURE_CET to X86_FEATURE_SHSTK. - Eliminate writing to MSR_IA32_U_CET for shadow stack. - Change wrmsrl() to wrmsrl_safe() and handle error. v24: - Split out shadow stack token routines to a separate patch. - Put signal frame save/restore routines to fpu/signal.c and re-name accordingly. arch/x86/ia32/ia32_signal.c | 24 +++-- arch/x86/include/asm/cet.h | 2 + arch/x86/include/asm/fpu/internal.h | 2 + arch/x86/include/uapi/asm/sigcontext.h | 9 ++ arch/x86/kernel/fpu/signal.c | 137 ++++++++++++++++++++++++- arch/x86/kernel/signal.c | 9 ++ 6 files changed, 172 insertions(+), 11 deletions(-) diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c index 5e3d9b7fd5fb..423abcd181f2 100644 --- a/arch/x86/ia32/ia32_signal.c +++ b/arch/x86/ia32/ia32_signal.c @@ -202,7 +202,8 @@ do { \ */ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size, - void __user **fpstate) + void __user **fpstate, + void __user *restorer) { unsigned long sp, fx_aligned, math_size; @@ -220,6 +221,10 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size); *fpstate = (struct _fpstate_32 __user *) sp; + + if (save_extra_state_to_sigframe(1, *fpstate, restorer)) + return (void __user *)-1L; + if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned, math_size) < 0) return (void __user *) -1L; @@ -249,8 +254,6 @@ int ia32_setup_frame(int sig, struct ksignal *ksig, 0x80cd, /* int $0x80 */ }; - frame = get_sigframe(ksig, regs, sizeof(*frame), &fp); - if (ksig->ka.sa.sa_flags & SA_RESTORER) { restorer = ksig->ka.sa.sa_restorer; } else { @@ -262,6 +265,8 @@ int ia32_setup_frame(int sig, struct ksignal *ksig, restorer = &frame->retcode; } + frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, restorer); + if (!user_access_begin(frame, sizeof(*frame))) return -EFAULT; @@ -317,7 +322,13 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig, 0, }; - frame = get_sigframe(ksig, regs, sizeof(*frame), &fp); + if (ksig->ka.sa.sa_flags & SA_RESTORER) + restorer = ksig->ka.sa.sa_restorer; + else + restorer = current->mm->context.vdso + + vdso_image_32.sym___kernel_rt_sigreturn; + + frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, restorer); if (!user_access_begin(frame, sizeof(*frame))) return -EFAULT; @@ -334,11 +345,6 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig, unsafe_put_user(0, &frame->uc.uc_link, Efault); unsafe_compat_save_altstack(&frame->uc.uc_stack, regs->sp, Efault); - if (ksig->ka.sa.sa_flags & SA_RESTORER) - restorer = ksig->ka.sa.sa_restorer; - else - restorer = current->mm->context.vdso + - vdso_image_32.sym___kernel_rt_sigreturn; unsafe_put_user(ptr_to_compat(restorer), &frame->pretcode, Efault); /* diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index ef6155213b7e..5e66919bd2fe 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -6,6 +6,8 @@ #include struct task_struct; +struct sc_ext; + /* * Per-thread CET status */ diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 8d33ad80704f..2a3c6a160dd6 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -443,6 +443,8 @@ static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate) __copy_kernel_to_fpregs(fpstate, -1); } +extern int save_extra_state_to_sigframe(int ia32, void __user *fp, + void __user *restorer); extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size); /* diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h index 844d60eb1882..10d7fa192d48 100644 --- a/arch/x86/include/uapi/asm/sigcontext.h +++ b/arch/x86/include/uapi/asm/sigcontext.h @@ -196,6 +196,15 @@ struct _xstate { /* New processor state extensions go here: */ }; +/* + * Located at the end of sigcontext->fpstate, aligned to 8. + * total_size keeps track of further extension of the struct. + */ +struct sc_ext { + unsigned long total_size; + unsigned long ssp; +}; + /* * The 32-bit signal frame: */ diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index a4ec65317a7f..0488407bec81 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -52,6 +52,107 @@ static inline int check_for_xstate(struct fxregs_state __user *buf, return 0; } +int save_extra_state_to_sigframe(int ia32, void __user *fp, void __user *restorer) +{ + int err = 0; + +#ifdef CONFIG_X86_SHADOW_STACK + struct cet_status *cet = ¤t->thread.cet; + unsigned long token_addr = 0, new_ssp = 0; + struct sc_ext ext = {}; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return 0; + + if (cet->shstk_size) { + err = shstk_setup_rstor_token(ia32, (unsigned long)restorer, + &token_addr, &new_ssp); + if (err) + return err; + + ext.ssp = token_addr; + + fpregs_lock(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + __fpregs_load_activate(); + if (new_ssp) + err = wrmsrl_safe(MSR_IA32_PL3_SSP, new_ssp); + fpregs_unlock(); + } + + if (!err && ext.ssp) { + void __user *p = fp; + + ext.total_size = sizeof(ext); + + p = fp; + if (ia32) + p += sizeof(struct fregs_state); + + p += fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE; + p = (void __user *)ALIGN((unsigned long)p, 8); + + if (copy_to_user(p, &ext, sizeof(ext))) + return -EFAULT; + } +#endif + return err; +} + +static int get_extra_state_from_sigframe(int ia32, void __user *fp, struct sc_ext *ext) +{ + int err = 0; + +#ifdef CONFIG_X86_SHADOW_STACK + struct cet_status *cet = ¤t->thread.cet; + void __user *p; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return 0; + + if (!cet->shstk_size) + return 0; + + memset(ext, 0, sizeof(*ext)); + + p = fp; + if (ia32) + p += sizeof(struct fregs_state); + + p += fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE; + p = (void __user *)ALIGN((unsigned long)p, 8); + + if (copy_from_user(ext, p, sizeof(*ext))) + return -EFAULT; + + if (ext->total_size != sizeof(*ext)) + return -EFAULT; + + if (cet->shstk_size) + err = shstk_check_rstor_token(ia32, ext->ssp, &ext->ssp); +#endif + return err; +} + +/* + * Called from __fpu__restore_sig() and protected by fpregs_lock(). + */ +static int restore_extra_state_to_xregs(struct sc_ext *sc_ext) +{ + int err = 0; + +#ifdef CONFIG_X86_SHADOW_STACK + struct cet_status *cet = ¤t->thread.cet; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return 0; + + if (cet->shstk_size) + err = wrmsrl_safe(MSR_IA32_PL3_SSP, sc_ext->ssp); +#endif + return err; +} + /* * Signal frame handlers. */ @@ -295,6 +396,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) struct task_struct *tsk = current; struct fpu *fpu = &tsk->thread.fpu; struct user_i387_ia32_struct env; + struct sc_ext sc_ext; u64 user_xfeatures = 0; int fx_only = 0; int ret = 0; @@ -335,6 +437,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) if ((unsigned long)buf_fx % 64) fx_only = 1; + ret = get_extra_state_from_sigframe(ia32_fxstate, buf, &sc_ext); + if (ret) + return ret; + if (!ia32_fxstate) { /* * Attempt to restore the FPU registers directly from user @@ -365,9 +471,17 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) xfeatures_mask_supervisor()) copy_kernel_to_xregs(&fpu->state.xsave, xfeatures_mask_supervisor()); - fpregs_mark_activate(); + + ret = restore_extra_state_to_xregs(&sc_ext); + if (!ret) { + fpregs_mark_activate(); + } else { + fpregs_deactivate(fpu); + fpu__clear_user_states(fpu); + } + fpregs_unlock(); - return 0; + return ret; } fpregs_unlock(); } else { @@ -430,6 +544,8 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures | xfeatures_mask_supervisor()); + if (!ret) + ret = restore_extra_state_to_xregs(&sc_ext); } else if (use_fxsr()) { ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size); if (ret) { @@ -491,12 +607,29 @@ int fpu__restore_sig(void __user *buf, int ia32_frame) return __fpu__restore_sig(buf, buf_fx, size); } +static unsigned long fpu__alloc_sigcontext_ext(unsigned long sp) +{ +#ifdef CONFIG_X86_SHADOW_STACK + struct cet_status *cet = ¤t->thread.cet; + + /* + * sigcontext_ext is at: fpu + fpu_user_xstate_size + + * FP_XSTATE_MAGIC2_SIZE, then aligned to 8. + */ + if (cet->shstk_size) + sp -= (sizeof(struct sc_ext) + 8); +#endif + return sp; +} + unsigned long fpu__alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx, unsigned long *size) { unsigned long frame_size = xstate_sigframe_size(); + sp = fpu__alloc_sigcontext_ext(sp); + *buf_fx = sp = round_down(sp - frame_size, 64); if (ia32_frame && use_fxsr()) { frame_size += sizeof(struct fregs_state); diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index f306e85a08a6..56d2412174c8 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -239,6 +239,9 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size, unsigned long buf_fx = 0; int onsigstack = on_sig_stack(sp); int ret; +#ifdef CONFIG_X86_64 + void __user *restorer = NULL; +#endif /* redzone */ if (IS_ENABLED(CONFIG_X86_64)) @@ -270,6 +273,12 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size, if (onsigstack && !likely(on_sig_stack(sp))) return (void __user *)-1L; +#ifdef CONFIG_X86_64 + if (ka->sa.sa_flags & SA_RESTORER) + restorer = ka->sa.sa_restorer; + ret = save_extra_state_to_sigframe(0, *fpstate, restorer); +#endif + /* save i387 and extended state */ ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size); if (ret < 0) From patchwork Tue Apr 27 20:43:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227275 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E9A2C43616 for ; Tue, 27 Apr 2021 20:45:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0E174613B0 for ; Tue, 27 Apr 2021 20:45:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0E174613B0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9EF776B0096; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 979206B0098; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A54E6B0099; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0167.hostedemail.com [216.40.44.167]) by kanga.kvack.org (Postfix) with ESMTP id 50E6E6B0096 for ; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 12EC08249980 for ; Tue, 27 Apr 2021 20:44:27 +0000 (UTC) X-FDA: 78079324974.04.A258320 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf09.hostedemail.com (Postfix) with ESMTP id 9F51E600010B for ; Tue, 27 Apr 2021 20:44:19 +0000 (UTC) IronPort-SDR: 1jYMUV+Aic9/ZO5B+xlcKvL3/0/9Un5d2viLCZS/3S4obaR0WwjpkhJaddR33yXLB++2aG9S1t jVW3SNzU+C8A== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706102" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706102" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:23 -0700 IronPort-SDR: 8yzCKqRy4Y6e6VOhwrRKcs9kGSozhQV6OQPWeNunGrFhBQ4UjYuLWjg58mnvj0Kl5CCA/t9BWY GQNo+uoBeOOQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623557" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:23 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , Mark Brown , Catalin Marinas Subject: [PATCH v26 26/30] ELF: Introduce arch_setup_elf_property() Date: Tue, 27 Apr 2021 13:43:11 -0700 Message-Id: <20210427204315.24153-27-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: 63h9o8pb1z4hsjk3p6bqxagxbr5xzohr X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9F51E600010B Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf09; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556259-341758 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: An ELF file's .note.gnu.property indicates arch features supported by the file. These features are extracted by arch_parse_elf_property() and stored in 'arch_elf_state'. Introduce x86 feature definitions and arch_setup_elf_property(), which enables such features. The first use-case of this function is Shadow Stack. ARM64 is the other arch that has ARCH_USE_GNU_PROPERTY and arch_parse_elf_ property(). Add arch_setup_elf_property() for it. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Cc: Mark Brown Cc: Catalin Marinas Cc: Dave Martin --- v24: - Change cet_setup_shstk() to shstk_setup() to reflect function name changes relating to the splitting of shadow stack and ibt. arch/arm64/include/asm/elf.h | 5 +++++ arch/x86/Kconfig | 2 ++ arch/x86/include/asm/elf.h | 13 +++++++++++++ arch/x86/kernel/process_64.c | 32 ++++++++++++++++++++++++++++++++ fs/binfmt_elf.c | 4 ++++ include/linux/elf.h | 6 ++++++ include/uapi/linux/elf.h | 9 +++++++++ 7 files changed, 71 insertions(+) diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h index 8d1c8dcb87fd..d37bc7915935 100644 --- a/arch/arm64/include/asm/elf.h +++ b/arch/arm64/include/asm/elf.h @@ -281,6 +281,11 @@ static inline int arch_parse_elf_property(u32 type, const void *data, return 0; } +static inline int arch_setup_elf_property(struct arch_elf_state *arch) +{ + return 0; +} + static inline int arch_elf_pt_proc(void *ehdr, void *phdr, struct file *f, bool is_interp, struct arch_elf_state *state) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 41283f82fd87..77d2e44995d7 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1951,6 +1951,8 @@ config X86_SHADOW_STACK depends on AS_WRUSS depends on ARCH_HAS_SHADOW_STACK select ARCH_USES_HIGH_VMA_FLAGS + select ARCH_USE_GNU_PROPERTY + select ARCH_BINFMT_ELF_STATE help Shadow Stack protection is a hardware feature that detects function return address corruption. This helps mitigate ROP attacks. diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index 9224d40cdefe..6a131047be8a 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -390,6 +390,19 @@ extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm, extern bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs); +#ifdef CONFIG_ARCH_BINFMT_ELF_STATE +struct arch_elf_state { + unsigned int gnu_property; +}; + +#define INIT_ARCH_ELF_STATE { \ + .gnu_property = 0, \ +} + +#define arch_elf_pt_proc(ehdr, phdr, elf, interp, state) (0) +#define arch_check_elf(ehdr, interp, interp_ehdr, state) (0) +#endif + /* Do not change the values. See get_align_mask() */ enum align_flags { ALIGN_VA_32 = BIT(0), diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index d08307df69ad..d71045b29475 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -835,3 +835,35 @@ unsigned long KSTK_ESP(struct task_struct *task) { return task_pt_regs(task)->sp; } + +#ifdef CONFIG_ARCH_USE_GNU_PROPERTY +int arch_parse_elf_property(u32 type, const void *data, size_t datasz, + bool compat, struct arch_elf_state *state) +{ + if (type != GNU_PROPERTY_X86_FEATURE_1_AND) + return 0; + + if (datasz != sizeof(unsigned int)) + return -ENOEXEC; + + state->gnu_property = *(unsigned int *)data; + return 0; +} + +int arch_setup_elf_property(struct arch_elf_state *state) +{ + int r = 0; + + if (!IS_ENABLED(CONFIG_X86_SHADOW_STACK)) + return r; + + memset(¤t->thread.cet, 0, sizeof(struct cet_status)); + + if (static_cpu_has(X86_FEATURE_SHSTK)) { + if (state->gnu_property & GNU_PROPERTY_X86_FEATURE_1_SHSTK) + r = shstk_setup(); + } + + return r; +} +#endif diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index b12ba98ae9f5..fa665eceba04 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1248,6 +1248,10 @@ static int load_elf_binary(struct linux_binprm *bprm) set_binfmt(&elf_format); + retval = arch_setup_elf_property(&arch_state); + if (retval < 0) + goto out; + #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES retval = ARCH_SETUP_ADDITIONAL_PAGES(bprm, elf_ex, !!interpreter); if (retval < 0) diff --git a/include/linux/elf.h b/include/linux/elf.h index c9a46c4e183b..be04d15e937f 100644 --- a/include/linux/elf.h +++ b/include/linux/elf.h @@ -92,9 +92,15 @@ static inline int arch_parse_elf_property(u32 type, const void *data, { return 0; } + +static inline int arch_setup_elf_property(struct arch_elf_state *arch) +{ + return 0; +} #else extern int arch_parse_elf_property(u32 type, const void *data, size_t datasz, bool compat, struct arch_elf_state *arch); +extern int arch_setup_elf_property(struct arch_elf_state *arch); #endif #ifdef CONFIG_ARCH_HAVE_ELF_PROT diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 30f68b42eeb5..24ba55ba8278 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -455,4 +455,13 @@ typedef struct elf64_note { /* Bits for GNU_PROPERTY_AARCH64_FEATURE_1_BTI */ #define GNU_PROPERTY_AARCH64_FEATURE_1_BTI (1U << 0) +/* .note.gnu.property types for x86: */ +#define GNU_PROPERTY_X86_FEATURE_1_AND 0xc0000002 + +/* Bits for GNU_PROPERTY_X86_FEATURE_1_AND */ +#define GNU_PROPERTY_X86_FEATURE_1_IBT 0x00000001 +#define GNU_PROPERTY_X86_FEATURE_1_SHSTK 0x00000002 +#define GNU_PROPERTY_X86_FEATURE_1_VALID (GNU_PROPERTY_X86_FEATURE_1_IBT | \ + GNU_PROPERTY_X86_FEATURE_1_SHSTK) + #endif /* _UAPI_LINUX_ELF_H */ From patchwork Tue Apr 27 20:43:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227277 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C7BEC433ED for ; Tue, 27 Apr 2021 20:45:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EDF15613BC for ; Tue, 27 Apr 2021 20:45:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EDF15613BC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 24CDC6B0098; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 200176B009B; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ECF5E6B009A; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0162.hostedemail.com [216.40.44.162]) by kanga.kvack.org (Postfix) with ESMTP id C51B76B0099 for ; Tue, 27 Apr 2021 16:44:27 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8786652D0 for ; Tue, 27 Apr 2021 20:44:27 +0000 (UTC) X-FDA: 78079324974.01.50D090F Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf02.hostedemail.com (Postfix) with ESMTP id C9D1B40002CF for ; Tue, 27 Apr 2021 20:43:56 +0000 (UTC) IronPort-SDR: 55ro3XrwWzWL0h/t5gyaddy94DHms3rRGImOr/P+Wj0S1y5M9GyTZWoxGGbEgdHjkyntZHp0ZI abRACnjnOpHw== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706105" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706105" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:24 -0700 IronPort-SDR: pdhN9DO68Oznb1S0c8QNNqSuQWmuvRdrE/NbCCsHn7EJKwxjY6suJDNgF98m2AOCMFAu7RZFTZ /Bc0iqlVDZLg== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623564" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:23 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu Subject: [PATCH v26 27/30] x86/cet/shstk: Add arch_prctl functions for shadow stack Date: Tue, 27 Apr 2021 13:43:12 -0700 Message-Id: <20210427204315.24153-28-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: wcn5sg8pbxyruxwokgot8ewzeuuuocq8 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C9D1B40002CF Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf02; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556236-754857 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: arch_prctl(ARCH_X86_CET_STATUS, u64 *args) Get CET feature status. The parameter 'args' is a pointer to a user buffer. The kernel returns the following information: *args = shadow stack/IBT status *(args + 1) = shadow stack base address *(args + 2) = shadow stack size 32-bit binaries use the same interface, but only lower 32-bits of each item. arch_prctl(ARCH_X86_CET_DISABLE, unsigned int features) Disable CET features specified in 'features'. Return -EPERM if CET is locked. arch_prctl(ARCH_X86_CET_LOCK) Lock in CET features. Also change do_arch_prctl_common()'s parameter 'cpuid_enabled' to 'arg2', as it is now also passed to prctl_cet(). Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook --- v25: - Change CONFIG_X86_CET to CONFIG_X86_SHADOW_STACK. - Change X86_FEATURE_CET to X86_FEATURE_SHSTK. v24: - Update #ifdef placement relating to shadow stack and ibt split. - Update function names. arch/x86/include/asm/cet.h | 7 ++++ arch/x86/include/uapi/asm/prctl.h | 4 +++ arch/x86/kernel/Makefile | 2 +- arch/x86/kernel/cet_prctl.c | 60 +++++++++++++++++++++++++++++++ arch/x86/kernel/process.c | 6 ++-- 5 files changed, 75 insertions(+), 4 deletions(-) create mode 100644 arch/x86/kernel/cet_prctl.c diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h index 5e66919bd2fe..662335ceb57f 100644 --- a/arch/x86/include/asm/cet.h +++ b/arch/x86/include/asm/cet.h @@ -14,6 +14,7 @@ struct sc_ext; struct cet_status { unsigned long shstk_base; unsigned long shstk_size; + unsigned int locked:1; }; #ifdef CONFIG_X86_SHADOW_STACK @@ -40,6 +41,12 @@ static inline int shstk_check_rstor_token(bool ia32, unsigned long token_addr, unsigned long *new_ssp) { return 0; } #endif +#ifdef CONFIG_X86_SHADOW_STACK +int prctl_cet(int option, u64 arg2); +#else +static inline int prctl_cet(int option, u64 arg2) { return -EINVAL; } +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_CET_H */ diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 5a6aac9fa41f..9245bf629120 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -14,4 +14,8 @@ #define ARCH_MAP_VDSO_32 0x2002 #define ARCH_MAP_VDSO_64 0x2003 +#define ARCH_X86_CET_STATUS 0x3001 +#define ARCH_X86_CET_DISABLE 0x3002 +#define ARCH_X86_CET_LOCK 0x3003 + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 0f99b093f350..eb13d578ad36 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -150,7 +150,7 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev-es.o -obj-$(CONFIG_X86_SHADOW_STACK) += shstk.o +obj-$(CONFIG_X86_SHADOW_STACK) += shstk.o cet_prctl.o ### # 64 bit specific files diff --git a/arch/x86/kernel/cet_prctl.c b/arch/x86/kernel/cet_prctl.c new file mode 100644 index 000000000000..3bb9f32ca70d --- /dev/null +++ b/arch/x86/kernel/cet_prctl.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* See Documentation/x86/intel_cet.rst. */ + +static int cet_copy_status_to_user(struct cet_status *cet, u64 __user *ubuf) +{ + u64 buf[3] = {}; + + if (cet->shstk_size) { + buf[0] |= GNU_PROPERTY_X86_FEATURE_1_SHSTK; + buf[1] = cet->shstk_base; + buf[2] = cet->shstk_size; + } + + return copy_to_user(ubuf, buf, sizeof(buf)); +} + +int prctl_cet(int option, u64 arg2) +{ + struct cet_status *cet; + + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) + return -ENOTSUPP; + + cet = ¤t->thread.cet; + + if (option == ARCH_X86_CET_STATUS) + return cet_copy_status_to_user(cet, (u64 __user *)arg2); + + switch (option) { + case ARCH_X86_CET_DISABLE: + if (cet->locked) + return -EPERM; + + if (arg2 & ~GNU_PROPERTY_X86_FEATURE_1_VALID) + return -EINVAL; + if (arg2 & GNU_PROPERTY_X86_FEATURE_1_SHSTK) + shstk_disable(); + return 0; + + case ARCH_X86_CET_LOCK: + if (arg2) + return -EINVAL; + cet->locked = 1; + return 0; + + default: + return -ENOSYS; + } +} diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index fa01e8679d01..315668a334fd 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -980,14 +980,14 @@ unsigned long get_wchan(struct task_struct *p) } long do_arch_prctl_common(struct task_struct *task, int option, - unsigned long cpuid_enabled) + unsigned long arg2) { switch (option) { case ARCH_GET_CPUID: return get_cpuid_mode(); case ARCH_SET_CPUID: - return set_cpuid_mode(task, cpuid_enabled); + return set_cpuid_mode(task, arg2); } - return -EINVAL; + return prctl_cet(option, arg2); } From patchwork Tue Apr 27 20:43:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227279 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2482CC43462 for ; Tue, 27 Apr 2021 20:45:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D177F613E5 for ; Tue, 27 Apr 2021 20:45:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D177F613E5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9A3A16B0099; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8B5D06B009C; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6913E6B009B; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0001.hostedemail.com [216.40.44.1]) by kanga.kvack.org (Postfix) with ESMTP id 1C8736B0099 for ; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D26B3180AD81A for ; Tue, 27 Apr 2021 20:44:27 +0000 (UTC) X-FDA: 78079324974.23.62D1913 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf09.hostedemail.com (Postfix) with ESMTP id 841E1600010B for ; Tue, 27 Apr 2021 20:44:20 +0000 (UTC) IronPort-SDR: wxMFCIXCdryAY15mJwlrn7YIjQGTRIqfFHsaLO5ig1rZtLItkqWCQOr6CmpNKbHV+jHitNlrNy E+Uebk/J3BIA== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706108" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706108" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:25 -0700 IronPort-SDR: sLuunklONRNhbhx66Q9Qc7lgCDoAmqA7cYNZMlv2zMALUcfNaX7QdmhjtfQKnAMg6YsnnvnqIJ yMa/7UyNmEyA== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623570" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:24 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 28/30] mm: Move arch_calc_vm_prot_bits() to arch/x86/include/asm/mman.h Date: Tue, 27 Apr 2021 13:43:13 -0700 Message-Id: <20210427204315.24153-29-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: 1bkfqftj61fhz3g6i7echayr85i4mrjo X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 841E1600010B Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf09; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556260-875780 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: To prepare the introduction of PROT_SHADOW_STACK and be consistent with other architectures, move arch_vm_get_page_prot() and arch_calc_vm_prot_bits() to arch/x86/include/asm/mman.h. Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook Reviewed-by: Kirill A. Shutemov --- arch/x86/include/asm/mman.h | 30 ++++++++++++++++++++++++++++++ arch/x86/include/uapi/asm/mman.h | 28 +++------------------------- 2 files changed, 33 insertions(+), 25 deletions(-) create mode 100644 arch/x86/include/asm/mman.h diff --git a/arch/x86/include/asm/mman.h b/arch/x86/include/asm/mman.h new file mode 100644 index 000000000000..629f6c81263a --- /dev/null +++ b/arch/x86/include/asm/mman.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_MMAN_H +#define _ASM_X86_MMAN_H + +#include +#include + +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS +/* + * Take the 4 protection key bits out of the vma->vm_flags + * value and turn them in to the bits that we can put in + * to a pte. + * + * Only override these if Protection Keys are available + * (which is only on 64-bit). + */ +#define arch_vm_get_page_prot(vm_flags) __pgprot( \ + ((vm_flags) & VM_PKEY_BIT0 ? _PAGE_PKEY_BIT0 : 0) | \ + ((vm_flags) & VM_PKEY_BIT1 ? _PAGE_PKEY_BIT1 : 0) | \ + ((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \ + ((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0)) + +#define arch_calc_vm_prot_bits(prot, key) ( \ + ((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \ + ((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \ + ((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \ + ((key) & 0x8 ? VM_PKEY_BIT3 : 0)) +#endif + +#endif /* _ASM_X86_MMAN_H */ diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h index d4a8d0424bfb..f28fa4acaeaf 100644 --- a/arch/x86/include/uapi/asm/mman.h +++ b/arch/x86/include/uapi/asm/mman.h @@ -1,31 +1,9 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef _ASM_X86_MMAN_H -#define _ASM_X86_MMAN_H +#ifndef _UAPI_ASM_X86_MMAN_H +#define _UAPI_ASM_X86_MMAN_H #define MAP_32BIT 0x40 /* only give out 32bit addresses */ -#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS -/* - * Take the 4 protection key bits out of the vma->vm_flags - * value and turn them in to the bits that we can put in - * to a pte. - * - * Only override these if Protection Keys are available - * (which is only on 64-bit). - */ -#define arch_vm_get_page_prot(vm_flags) __pgprot( \ - ((vm_flags) & VM_PKEY_BIT0 ? _PAGE_PKEY_BIT0 : 0) | \ - ((vm_flags) & VM_PKEY_BIT1 ? _PAGE_PKEY_BIT1 : 0) | \ - ((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \ - ((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0)) - -#define arch_calc_vm_prot_bits(prot, key) ( \ - ((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \ - ((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \ - ((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \ - ((key) & 0x8 ? VM_PKEY_BIT3 : 0)) -#endif - #include -#endif /* _ASM_X86_MMAN_H */ +#endif /* _UAPI_ASM_X86_MMAN_H */ From patchwork Tue Apr 27 20:43:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227283 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00C3EC43461 for ; Tue, 27 Apr 2021 20:45:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A402E60720 for ; Tue, 27 Apr 2021 20:45:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A402E60720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 29D476B009B; Tue, 27 Apr 2021 16:44:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 142246B009E; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BE9906B009D; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id 955FE6B009B for ; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 55B95181AEF15 for ; Tue, 27 Apr 2021 20:44:28 +0000 (UTC) X-FDA: 78079325016.20.6678391 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf26.hostedemail.com (Postfix) with ESMTP id 29E5E40002C0 for ; Tue, 27 Apr 2021 20:44:18 +0000 (UTC) IronPort-SDR: 47vTpwTw/mXr2G2FyRxa8IzKp9XIb52oHWXXyY8eUUxRqhGZBK/gbCKbXmsRwdtwYHtZ0wFeN2 mfiMV6HVJINQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706111" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706111" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:26 -0700 IronPort-SDR: xtJHpIS9Akgk3yrr6I4AH9oYnMCacZLvBt3pMls0JqIaGOZgvywvKXp62L6pyHD+s4IAixmFBu 6gDaLjPlgRdQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623573" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:25 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , Catalin Marinas , "Kirill A . Shutemov" , Vincenzo Frascino , Will Deacon Subject: [PATCH v26 29/30] mm: Update arch_validate_flags() to test vma anonymous Date: Tue, 27 Apr 2021 13:43:14 -0700 Message-Id: <20210427204315.24153-30-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 29E5E40002C0 X-Stat-Signature: zr7srdegomd84uz9w1z5ga3rirjj3x8w Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf26; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556258-360284 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When newer VM flags are being created, such as VM_MTE, it becomes necessary for mmap/mprotect to verify if certain flags are being applied to an anonymous VMA. To solve this, one approach is adding a VM flag to track that MAP_ANONYMOUS is specified [1], and then using the flag in arch_validate_flags(). Another approach is passing the VMA to arch_validate_flags(), and check vma_is_anonymous(). To prepare the introduction of PROT_SHADOW_STACK, which creates a shadow stack mapping and can be applied only to an anonymous VMA, update arch_validate_flags() to pass in the VMA. [1] commit 9f3419315f3c ("arm64: mte: Add PROT_MTE support to mmap() and mprotect()"), Signed-off-by: Yu-cheng Yu Cc: Catalin Marinas Cc: Kees Cook Cc: Kirill A. Shutemov Cc: Vincenzo Frascino Cc: Will Deacon Reviewed-by: Kirill A. Shutemov --- v26: - Instead of passing vma is anonymous, pass the VMA to arch_validate_flags(). arch/arm64/include/asm/mman.h | 4 ++-- arch/sparc/include/asm/mman.h | 4 ++-- include/linux/mman.h | 2 +- mm/mmap.c | 2 +- mm/mprotect.c | 2 +- 5 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h index e3e28f7daf62..7c45e7578f78 100644 --- a/arch/arm64/include/asm/mman.h +++ b/arch/arm64/include/asm/mman.h @@ -74,7 +74,7 @@ static inline bool arch_validate_prot(unsigned long prot, } #define arch_validate_prot(prot, addr) arch_validate_prot(prot, addr) -static inline bool arch_validate_flags(unsigned long vm_flags) +static inline bool arch_validate_flags(struct vm_area_struct *vma, unsigned long vm_flags) { if (!system_supports_mte()) return true; @@ -82,6 +82,6 @@ static inline bool arch_validate_flags(unsigned long vm_flags) /* only allow VM_MTE if VM_MTE_ALLOWED has been set previously */ return !(vm_flags & VM_MTE) || (vm_flags & VM_MTE_ALLOWED); } -#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags) +#define arch_validate_flags(vma, vm_flags) arch_validate_flags(vma, vm_flags) #endif /* ! __ASM_MMAN_H__ */ diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h index 274217e7ed70..0ec4975f167d 100644 --- a/arch/sparc/include/asm/mman.h +++ b/arch/sparc/include/asm/mman.h @@ -60,11 +60,11 @@ static inline int sparc_validate_prot(unsigned long prot, unsigned long addr) return 1; } -#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags) +#define arch_validate_flags(vma, vm_flags) arch_validate_flags(vma, vm_flags) /* arch_validate_flags() - Ensure combination of flags is valid for a * VMA. */ -static inline bool arch_validate_flags(unsigned long vm_flags) +static inline bool arch_validate_flags(struct vm_area_struct *vma, unsigned long vm_flags) { /* If ADI is being enabled on this VMA, check for ADI * capability on the platform and ensure VMA is suitable diff --git a/include/linux/mman.h b/include/linux/mman.h index 629cefc4ecba..41d6fbf4e7d9 100644 --- a/include/linux/mman.h +++ b/include/linux/mman.h @@ -114,7 +114,7 @@ static inline bool arch_validate_prot(unsigned long prot, unsigned long addr) * * Returns true if the VM_* flags are valid. */ -static inline bool arch_validate_flags(unsigned long flags) +static inline bool arch_validate_flags(struct vm_area_struct *vma, unsigned long flags) { return true; } diff --git a/mm/mmap.c b/mm/mmap.c index 7b2992ef8ee0..b6a2b44b75dc 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1850,7 +1850,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, } /* Allow architectures to sanity-check the vm_flags */ - if (!arch_validate_flags(vma->vm_flags)) { + if (!arch_validate_flags(vma, vma->vm_flags)) { error = -EINVAL; if (file) goto unmap_and_free_vma; diff --git a/mm/mprotect.c b/mm/mprotect.c index 3b2f0d75519f..7aef1e1af11a 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -611,7 +611,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len, } /* Allow architectures to sanity-check the new flags */ - if (!arch_validate_flags(newflags)) { + if (!arch_validate_flags(vma, newflags)) { error = -EINVAL; goto out; } From patchwork Tue Apr 27 20:43:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu-cheng Yu X-Patchwork-Id: 12227281 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3BD8C43460 for ; Tue, 27 Apr 2021 20:45:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AA2F8613B0 for ; Tue, 27 Apr 2021 20:45:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AA2F8613B0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E35636B009A; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DE7996B009C; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ABC216B009A; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0132.hostedemail.com [216.40.44.132]) by kanga.kvack.org (Postfix) with ESMTP id 82B3B6B009A for ; Tue, 27 Apr 2021 16:44:28 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5108552C5 for ; Tue, 27 Apr 2021 20:44:28 +0000 (UTC) X-FDA: 78079325016.01.D0A3559 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf02.hostedemail.com (Postfix) with ESMTP id 9DBBA40002C8 for ; Tue, 27 Apr 2021 20:43:57 +0000 (UTC) IronPort-SDR: DN3DWM9J9xs5n1u842dY+92jBCpQXGWRgqZmuh4Gnvo3y42e1zSKMiphZwX2coUTd9y3CNWc1h UJ7Mqq3Vg8FQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="176706115" X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="176706115" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:26 -0700 IronPort-SDR: xWB2H/oAI2gkiea5eIHtXHrJcmmmzHjfnM/w0JGR3fwufgi1UErfUQeXDwBL4TzugG+6o0qqyj zJ4vZYmADzSQ== X-IronPort-AV: E=Sophos;i="5.82,255,1613462400"; d="scan'208";a="465623580" Received: from yyu32-desk.sc.intel.com ([143.183.136.146]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Apr 2021 13:44:25 -0700 From: Yu-cheng Yu To: x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann , Andy Lutomirski , Balbir Singh , Borislav Petkov , Cyrill Gorcunov , Dave Hansen , Eugene Syromiatnikov , Florian Weimer , "H.J. Lu" , Jann Horn , Jonathan Corbet , Kees Cook , Mike Kravetz , Nadav Amit , Oleg Nesterov , Pavel Machek , Peter Zijlstra , Randy Dunlap , "Ravi V. Shankar" , Vedvyas Shanbhogue , Dave Martin , Weijiang Yang , Pengfei Xu , Haitao Huang Cc: Yu-cheng Yu , "Kirill A . Shutemov" Subject: [PATCH v26 30/30] mm: Introduce PROT_SHADOW_STACK for shadow stack Date: Tue, 27 Apr 2021 13:43:15 -0700 Message-Id: <20210427204315.24153-31-yu-cheng.yu@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20210427204315.24153-1-yu-cheng.yu@intel.com> References: <20210427204315.24153-1-yu-cheng.yu@intel.com> MIME-Version: 1.0 X-Stat-Signature: 9i3h48fizg5mwy4shz6x61pw8y5o8h3z X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9DBBA40002C8 Received-SPF: none (intel.com>: No applicable sender policy available) receiver=imf02; identity=mailfrom; envelope-from=""; helo=mga17.intel.com; client-ip=192.55.52.151 X-HE-DKIM-Result: none/none X-HE-Tag: 1619556237-33233 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There are three possible options to create a shadow stack allocation API: an arch_prctl, a new syscall, or adding PROT_SHADOW_STACK to mmap() and mprotect(). Each has its advantages and compromises. An arch_prctl() is the least intrusive. However, the existing x86 arch_prctl() takes only two parameters. Multiple parameters must be passed in a memory buffer. There is a proposal to pass more parameters in registers [1], but no active discussion on that. A new syscall minimizes compatibility issues and offers an extensible frame work to other architectures, but this will likely result in some overlap of mmap()/mprotect(). The introduction of PROT_SHADOW_STACK to mmap()/mprotect() takes advantage of existing APIs. The x86-specific PROT_SHADOW_STACK is translated to VM_SHADOW_STACK and a shadow stack mapping is created without reinventing the wheel. There are potential pitfalls though. The most obvious one would be using this as a bypass to shadow stack protection. However, the attacker would have to get to the syscall first. [1] https://lore.kernel.org/lkml/20200828121624.108243-1-hjl.tools@gmail.com/ Signed-off-by: Yu-cheng Yu Cc: Kees Cook Cc: Kirill A. Shutemov Reviewed-by: Kirill A. Shutemov --- v26: - Change PROT_SHSTK to PROT_SHADOW_STACK. - Remove (vm_flags & VM_SHARED) check, since it is covered by !vma_is_anonymous(). v24: - Update arch_calc_vm_prot_bits(), leave PROT* checking to arch_validate_prot(). - Update arch_validate_prot(), leave vma flags checking to arch_validate_flags(). - Add arch_validate_flags(). arch/x86/include/asm/mman.h | 60 +++++++++++++++++++++++++++++++- arch/x86/include/uapi/asm/mman.h | 2 ++ include/linux/mm.h | 1 + 3 files changed, 62 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/mman.h b/arch/x86/include/asm/mman.h index 629f6c81263a..fbb90f1b02c0 100644 --- a/arch/x86/include/asm/mman.h +++ b/arch/x86/include/asm/mman.h @@ -20,11 +20,69 @@ ((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \ ((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0)) -#define arch_calc_vm_prot_bits(prot, key) ( \ +#define pkey_vm_prot_bits(prot, key) ( \ ((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \ ((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \ ((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \ ((key) & 0x8 ? VM_PKEY_BIT3 : 0)) +#else +#define pkey_vm_prot_bits(prot, key) (0) #endif +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot, + unsigned long pkey) +{ + unsigned long vm_prot_bits = pkey_vm_prot_bits(prot, pkey); + + if (prot & PROT_SHADOW_STACK) + vm_prot_bits |= VM_SHADOW_STACK; + + return vm_prot_bits; +} + +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey) + +#ifdef CONFIG_X86_SHADOW_STACK +static inline bool arch_validate_prot(unsigned long prot, unsigned long addr) +{ + unsigned long valid = PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | + PROT_SHADOW_STACK; + + if (prot & ~valid) + return false; + + if (prot & PROT_SHADOW_STACK) { + if (!current->thread.cet.shstk_size) + return false; + + /* + * A shadow stack mapping is indirectly writable by only + * the CALL and WRUSS instructions, but not other write + * instructions). PROT_SHADOW_STACK and PROT_WRITE are + * mutually exclusive. + */ + if (prot & PROT_WRITE) + return false; + } + + return true; +} + +#define arch_validate_prot arch_validate_prot + +static inline bool arch_validate_flags(struct vm_area_struct *vma, unsigned long vm_flags) +{ + /* + * Shadow stack must be anonymous and not shared. + */ + if ((vm_flags & VM_SHADOW_STACK) && !vma_is_anonymous(vma)) + return false; + + return true; +} + +#define arch_validate_flags(vma, vm_flags) arch_validate_flags(vma, vm_flags) + +#endif /* CONFIG_X86_SHADOW_STACK */ + #endif /* _ASM_X86_MMAN_H */ diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h index f28fa4acaeaf..4c36b263cf0a 100644 --- a/arch/x86/include/uapi/asm/mman.h +++ b/arch/x86/include/uapi/asm/mman.h @@ -4,6 +4,8 @@ #define MAP_32BIT 0x40 /* only give out 32bit addresses */ +#define PROT_SHADOW_STACK 0x10 /* shadow stack pages */ + #include #endif /* _UAPI_ASM_X86_MMAN_H */ diff --git a/include/linux/mm.h b/include/linux/mm.h index 1ccec5cc399b..9a7652eea207 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -342,6 +342,7 @@ extern unsigned int kobjsize(const void *objp); #if defined(CONFIG_X86) # define VM_PAT VM_ARCH_1 /* PAT reserves whole VMA at once (x86) */ +# define VM_ARCH_CLEAR VM_SHADOW_STACK #elif defined(CONFIG_PPC) # define VM_SAO VM_ARCH_1 /* Strong Access Ordering (powerpc) */ #elif defined(CONFIG_PARISC)