From patchwork Tue Jun 14 12:02:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880958 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9265CC43334 for ; Tue, 14 Jun 2022 12:02:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BBF9D8D023E; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B47258D023D; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9E9078D023E; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 8BBE38D023C for ; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 57FE26191D for ; Tue, 14 Jun 2022 12:02:40 +0000 (UTC) X-FDA: 79576704480.10.9918F53 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf22.hostedemail.com (Postfix) with ESMTP id E5949C00B4 for ; Tue, 14 Jun 2022 12:02:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208159; x=1686744159; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Gepnq+G3FqhcuUO0CXCxB2ziSQE4OecO4NsW7OsOcao=; b=nR+wanYqZhv5VIsI4gLeKrheYEDhChB+cn7CnrUzsPtM+nBDVBWljVaa jy21cHiVylNDymxXIAHYMZwO9dOMHL6wpS3sRVEsQQAbqBQZkpvRImhEi SfFhPtsHzJEFtRrSwOSQo7iOXs6pEogcUeG3VnvglDho9i9uu5DkrzUTG xtJolo9v6e9kAVQyYp70SQB6FBaHZXqAR/QGt6VlTAKlH+2+SCFr5iTpI hdfyJZKcqnrpOp8QN0y4E16A1VQFnboM1CTL5S5ObAl152QK4HVbIROtC MtfC9qYo/d+LhIA5UoNDDUkFEiekJcES1nH/QC+gc6J2xdAcxTHyJRm4J Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="267281644" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="267281644" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="712440884" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga004.jf.intel.com with ESMTP; 14 Jun 2022 05:02:28 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 825B118F; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Mike Rapoport Subject: [PATCHv7 01/14] x86/boot: Centralize __pa()/__va() definitions Date: Tue, 14 Jun 2022 15:02:18 +0300 Message-Id: <20220614120231.48165-2-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208159; a=rsa-sha256; cv=none; b=s51K9/cz+04twDxpABYjl1CC1AQw/rmHu374BYvn+ymiQfF7NN5aUjBHc0UXikwVZkfKSM OzlifXhDtUS81M8iGAOvvT8QbjXVGTX6qj9tVuT1/A3aOI1q4vu/ygSXna5HZzAfFeq20o kD+jZK6pMHl25KYos3/aWBRhUWnPaKg= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=nR+wanYq; spf=none (imf22.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208159; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pvd0q95uJCSzrUcSPspBy/3xWQeyzSLMNTzhkSvL6Qo=; b=x7GT/GzO7IszJk1/CguRCGVhvZ7EaCvJCfhudFc6OabnubJpEqCiR0JZLatzs53f5+u0OL 0RCHV/W2PMeZfNlMp09ZKsZcGvUAEf9CQr5F2hmaG0lqMxCZP8fNvZ0QHOYOqAY0+E6RwT aIcRmA5kgmdM9mOmR8YwUGTwN2H5e2E= X-Stat-Signature: d3n84fnbwqo88g6rweorrzmycbbomact X-Rspamd-Queue-Id: E5949C00B4 Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=nR+wanYq; spf=none (imf22.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655208158-619740 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Replace multiple __pa()/__va() definitions with a single one in misc.h. Signed-off-by: Kirill A. Shutemov Reviewed-by: David Hildenbrand Reviewed-by: Mike Rapoport Reviewed-by: Dave Hansen --- arch/x86/boot/compressed/ident_map_64.c | 8 -------- arch/x86/boot/compressed/misc.h | 9 +++++++++ arch/x86/boot/compressed/sev.c | 2 -- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c index 44c350d627c7..66a2e992c5f5 100644 --- a/arch/x86/boot/compressed/ident_map_64.c +++ b/arch/x86/boot/compressed/ident_map_64.c @@ -8,14 +8,6 @@ * Copyright (C) 2016 Kees Cook */ -/* - * Since we're dealing with identity mappings, physical and virtual - * addresses are the same, so override these defines which are ultimately - * used by the headers in misc.h. - */ -#define __pa(x) ((unsigned long)(x)) -#define __va(x) ((void *)((unsigned long)(x))) - /* No PAGE_TABLE_ISOLATION support needed either: */ #undef CONFIG_PAGE_TABLE_ISOLATION diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 4910bf230d7b..245cf8f2a0bd 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -19,6 +19,15 @@ /* cpu_feature_enabled() cannot be used this early */ #define USE_EARLY_PGTABLE_L5 +/* + * Boot stub deals with identity mappings, physical and virtual addresses are + * the same, so override these defines. + * + * will not define them if they are already defined. + */ +#define __pa(x) ((unsigned long)(x)) +#define __va(x) ((void *)((unsigned long)(x))) + #include #include #include diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 52f989f6acc2..730c4677e9db 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -104,9 +104,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt, } #undef __init -#undef __pa #define __init -#define __pa(x) ((unsigned long)(x)) #define __BOOT_COMPRESSED From patchwork Tue Jun 14 12:02:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880960 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37BF6CCA480 for ; Tue, 14 Jun 2022 12:02:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C5E2B8D023F; Tue, 14 Jun 2022 08:02:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BBC908D023D; Tue, 14 Jun 2022 08:02:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A35BC8D023F; Tue, 14 Jun 2022 08:02:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 954E78D023D for ; Tue, 14 Jun 2022 08:02:41 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay13.hostedemail.com (Postfix) with ESMTP id 46E9760A5D for ; Tue, 14 Jun 2022 12:02:41 +0000 (UTC) X-FDA: 79576704522.20.97E4F37 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf26.hostedemail.com (Postfix) with ESMTP id 4977B1400AC for ; Tue, 14 Jun 2022 12:02:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208157; x=1686744157; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9Eut4IV6xmO4iZzDBiLqTTFy9q7BW5oKpCzB2UrofAU=; b=jcz0paQbMEqe37P9iuUZPweiVUY7OarZryM82aRqtM0Y3MKufUqgTfrJ rT5XgyFyEswhU6TialLul4sxYSVctB8WXEet6ctY/Ad5K8BxzPS5ldcCq Y19kprPRRwWMPTrJO/VXjX3l90EVBlSXdL0fWr+bNhE1A1NvE8IRU2JAL gVHnE1beYRY1v+fM5zN6QvAbsyKGZsx1enZKd+i7oSfO2lXhNHdWQF5De 2hQFfNCZt8/+4gtC8RH5kw0m20dBJPdWF+RoewC6wndosfmnxo/4g3tQU /x5JKv666WGdJ31s6V22Gb5XaKFwiC8i/PIAZWrohn9IYgMH1+ZGqDL7O A==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="277380183" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="277380183" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="726791038" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga001.fm.intel.com with ESMTP; 14 Jun 2022 05:02:28 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 8CDD5506; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Mike Rapoport Subject: [PATCHv7 02/14] mm: Add support for unaccepted memory Date: Tue, 14 Jun 2022 15:02:19 +0300 Message-Id: <20220614120231.48165-3-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208158; a=rsa-sha256; cv=none; b=7qOxRVqcVzM6AO9KwmXg1Y8NKpA2bL4LfVkXDoVQmddQm57NCVEFuZzN/ggNgyRnfxclRQ e1rlsYIyY8lvfeec/j9TmwCDZHBQMDEBjqd1HHPWYyTFss9YMBrYGxwhdTxyzILJR/RcDG IhAS7CLGDr+qpAciIpFrs32+jp/4ues= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=jcz0paQb; spf=none (imf26.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208158; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wymVQ3vl7pOv0Q6HyQs8gVcGqrGIlkKZx2YhfwkCzSA=; b=AiIQzCT20/77exAcvtJezl/8Fyt3aGKIPJmb7BIk14annguv/+AMpxXfk+BaQsh3dtNnO/ YA+zYEaKXoixVxTyEv8JgeOtgEQtzt/MG307X5KxY3wJ7WRYDV032AWxYGmMD5WDLyylg3 3gMezEJ3ApvtNRoJvMHDOQc3RX5cJIg= X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 4977B1400AC X-Stat-Signature: akbqn5s8hbjbidosz1u78baz7zwgwd7t Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=jcz0paQb; spf=none (imf26.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-HE-Tag: 1655208157-581492 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: UEFI Specification version 2.9 introduces the concept of memory acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific to the Virtual Machine platform. There are several ways kernel can deal with unaccepted memory: 1. Accept all the memory during the boot. It is easy to implement and it doesn't have runtime cost once the system is booted. The downside is very long boot time. Accept can be parallelized to multiple CPUs to keep it manageable (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate memory bandwidth and does not scale beyond the point. 2. Accept a block of memory on the first use. It requires more infrastructure and changes in page allocator to make it work, but it provides good boot time. On-demand memory accept means latency spikes every time kernel steps onto a new memory block. The spikes will go away once workload data set size gets stabilized or all memory gets accepted. 3. Accept all memory in background. Introduce a thread (or multiple) that gets memory accepted proactively. It will minimize time the system experience latency spikes on memory allocation while keeping low boot time. This approach cannot function on its own. It is an extension of #2: background memory acceptance requires functional scheduler, but the page allocator may need to tap into unaccepted memory before that. The downside of the approach is that these threads also steal CPU cycles and memory bandwidth from the user's workload and may hurt user experience. Implement #2 for now. It is a reasonable default. Some workloads may want to use #1 or #3 and they can be implemented later based on user's demands. Support of unaccepted memory requires a few changes in core-mm code: - memblock has to accept memory on allocation; - page allocator has to accept memory on the first allocation of the page; Memblock change is trivial. The page allocator is modified to accept pages on the first allocation. The new page type (encoded in the _mapcount) -- PageUnaccepted() -- is used to indicate that the page requires acceptance. Architecture has to provide two helpers if it wants to support unaccepted memory: - accept_memory() makes a range of physical addresses accepted. - range_contains_unaccepted_memory() checks anything within the range of physical addresses requires acceptance. Signed-off-by: Kirill A. Shutemov Acked-by: Mike Rapoport # memblock Reviewed-by: David Hildenbrand Acked-by: Pankaj Gupta --- include/linux/page-flags.h | 31 +++++++++++++ mm/internal.h | 12 +++++ mm/memblock.c | 9 ++++ mm/page_alloc.c | 89 +++++++++++++++++++++++++++++++++++++- 4 files changed, 139 insertions(+), 2 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index e66f7aa3191d..444ba8f4bfa0 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -942,6 +942,14 @@ static inline bool is_page_hwpoison(struct page *page) #define PG_offline 0x00000100 #define PG_table 0x00000200 #define PG_guard 0x00000400 +#define PG_unaccepted 0x00000800 + +/* + * Page types allowed at page_expected_state() + * + * PageUnaccepted() will get cleared in post_alloc_hook(). + */ +#define PAGE_TYPES_EXPECTED PG_unaccepted #define PageType(page, flag) \ ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE) @@ -967,6 +975,18 @@ static __always_inline void __ClearPage##uname(struct page *page) \ page->page_type |= PG_##lname; \ } +#define PAGE_TYPE_OPS_FALSE(uname) \ +static __always_inline int Page##uname(struct page *page) \ +{ \ + return false; \ +} \ +static __always_inline void __SetPage##uname(struct page *page) \ +{ \ +} \ +static __always_inline void __ClearPage##uname(struct page *page) \ +{ \ +} + /* * PageBuddy() indicates that the page is free and in the buddy system * (see mm/page_alloc.c). @@ -997,6 +1017,17 @@ PAGE_TYPE_OPS(Buddy, buddy) */ PAGE_TYPE_OPS(Offline, offline) +/* + * PageUnaccepted() indicates that the page has to be "accepted" before it can + * be read or written. The page allocator must call accept_page() before + * touching the page or returning it to the caller. + */ +#ifdef CONFIG_UNACCEPTED_MEMORY +PAGE_TYPE_OPS(Unaccepted, unaccepted) +#else +PAGE_TYPE_OPS_FALSE(Unaccepted) +#endif + extern void page_offline_freeze(void); extern void page_offline_thaw(void); extern void page_offline_begin(void); diff --git a/mm/internal.h b/mm/internal.h index c0f8fbe0445b..7c1a653edfc8 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -861,4 +861,16 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags); DECLARE_PER_CPU(struct per_cpu_nodestat, boot_nodestats); +#ifndef CONFIG_UNACCEPTED_MEMORY +static inline bool range_contains_unaccepted_memory(phys_addr_t start, + phys_addr_t end) +{ + return false; +} + +static inline void accept_memory(phys_addr_t start, phys_addr_t end) +{ +} +#endif + #endif /* __MM_INTERNAL_H */ diff --git a/mm/memblock.c b/mm/memblock.c index e4f03a6e8e56..a1f7f8b304d5 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1405,6 +1405,15 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size, */ kmemleak_alloc_phys(found, size, 0, 0); + /* + * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, + * require memory to be accepted before it can be used by the + * guest. + * + * Accept the memory of the allocated buffer. + */ + accept_memory(found, found + size); + return found; } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e008a3df0485..279c2746aaa8 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -122,6 +122,12 @@ typedef int __bitwise fpi_t; */ #define FPI_SKIP_KASAN_POISON ((__force fpi_t)BIT(2)) +/* + * Check if the page needs to be marked as PageUnaccepted(). + * Used for the new pages added to the buddy allocator for the first time. + */ +#define FPI_UNACCEPTED_SLOWPATH ((__force fpi_t)BIT(3)) + /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */ static DEFINE_MUTEX(pcp_batch_high_lock); #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8) @@ -993,6 +999,35 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn, NULL) != NULL; } +/* + * Page acceptance can be very slow. Do not call under critical locks. + */ +static void accept_page(struct page *page, unsigned int order) +{ + phys_addr_t start = page_to_phys(page); + int i; + + if (!PageUnaccepted(page)) + return; + + accept_memory(start, start + (PAGE_SIZE << order)); + __ClearPageUnaccepted(page); + + /* Assert that there is no PageUnaccepted() on tail pages */ + if (IS_ENABLED(CONFIG_DEBUG_VM)) { + for (i = 1; i < (1 << order); i++) + VM_BUG_ON_PAGE(PageUnaccepted(page + i), page + i); + } +} + +static bool page_contains_unaccepted(struct page *page, unsigned int order) +{ + phys_addr_t start = page_to_phys(page); + phys_addr_t end = start + (PAGE_SIZE << order); + + return range_contains_unaccepted_memory(start, end); +} + /* * Freeing function for a buddy system allocator. * @@ -1027,6 +1062,7 @@ static inline void __free_one_page(struct page *page, unsigned long combined_pfn; struct page *buddy; bool to_tail; + bool page_needs_acceptance = false; VM_BUG_ON(!zone_is_initialized(zone)); VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page); @@ -1038,6 +1074,11 @@ static inline void __free_one_page(struct page *page, VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page); VM_BUG_ON_PAGE(bad_range(zone, page), page); + if (PageUnaccepted(page)) { + page_needs_acceptance = true; + __ClearPageUnaccepted(page); + } + while (order < MAX_ORDER - 1) { if (compaction_capture(capc, page, order, migratetype)) { __mod_zone_freepage_state(zone, -(1 << order), @@ -1072,6 +1113,13 @@ static inline void __free_one_page(struct page *page, clear_page_guard(zone, buddy, order, migratetype); else del_page_from_free_list(buddy, zone, order); + + /* Mark page unaccepted if any of merged pages were unaccepted */ + if (PageUnaccepted(buddy)) { + page_needs_acceptance = true; + __ClearPageUnaccepted(buddy); + } + combined_pfn = buddy_pfn & pfn; page = page + (combined_pfn - pfn); pfn = combined_pfn; @@ -1081,6 +1129,23 @@ static inline void __free_one_page(struct page *page, done_merging: set_buddy_order(page, order); + /* + * The page gets marked as PageUnaccepted() if any of merged-in pages + * is PageUnaccepted(). + * + * New pages, just being added to buddy allocator, do not have + * PageUnaccepted() set. FPI_UNACCEPTED_SLOWPATH indicates that the + * page is new and page_contains_unaccepted() check is required to + * determinate if acceptance is required. + * + * Avoid calling page_contains_unaccepted() if it is known that the page + * needs acceptance. It can be costly. + */ + if (!page_needs_acceptance && (fpi_flags & FPI_UNACCEPTED_SLOWPATH)) + page_needs_acceptance = page_contains_unaccepted(page, order); + if (page_needs_acceptance) + __SetPageUnaccepted(page); + if (fpi_flags & FPI_TO_TAIL) to_tail = true; else if (is_shuffle_order(order)) @@ -1164,7 +1229,13 @@ int split_free_page(struct page *free_page, static inline bool page_expected_state(struct page *page, unsigned long check_flags) { - if (unlikely(atomic_read(&page->_mapcount) != -1)) + /* + * The page must not be mapped to userspace and must not have + * a PageType other than listed in PAGE_TYPES_EXPECTED. + * + * Note, bit cleared means the page type is set. + */ + if (unlikely((atomic_read(&page->_mapcount) | PAGE_TYPES_EXPECTED) != -1)) return false; if (unlikely((unsigned long)page->mapping | @@ -1669,7 +1740,9 @@ void __free_pages_core(struct page *page, unsigned int order) * Bypass PCP and place fresh pages right to the tail, primarily * relevant for memory onlining. */ - __free_pages_ok(page, order, FPI_TO_TAIL | FPI_SKIP_KASAN_POISON); + __free_pages_ok(page, order, + FPI_TO_TAIL | FPI_SKIP_KASAN_POISON | + FPI_UNACCEPTED_SLOWPATH); } #ifdef CONFIG_NUMA @@ -1822,6 +1895,9 @@ static void __init deferred_free_range(unsigned long pfn, return; } + /* Accept chunks smaller than page-block upfront */ + accept_memory(PFN_PHYS(pfn), PFN_PHYS(pfn + nr_pages)); + for (i = 0; i < nr_pages; i++, page++, pfn++) { if ((pfn & (pageblock_nr_pages - 1)) == 0) set_pageblock_migratetype(page, MIGRATE_MOVABLE); @@ -2281,6 +2357,13 @@ static inline void expand(struct zone *zone, struct page *page, if (set_page_guard(zone, &page[size], high, migratetype)) continue; + /* + * Transfer PageUnaccepted() to the newly split pages so + * they can be accepted after dropping the zone lock. + */ + if (PageUnaccepted(page)) + __SetPageUnaccepted(&page[size]); + add_to_free_list(&page[size], zone, high, migratetype); set_buddy_order(&page[size], high); } @@ -2411,6 +2494,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order, */ kernel_unpoison_pages(page, 1 << order); + accept_page(page, order); + /* * As memory initialization might be integrated into KASAN, * KASAN unpoisoning and memory initializion code must be From patchwork Tue Jun 14 12:02:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880959 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A52F1C43334 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E1A2B8D023C; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D980A8D023F; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AAB3D8D023C; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 952F28D023D for ; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay12.hostedemail.com (Postfix) with ESMTP id 49C96121705 for ; Tue, 14 Jun 2022 12:02:40 +0000 (UTC) X-FDA: 79576704480.02.6EC3344 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by imf18.hostedemail.com (Postfix) with ESMTP id 49B251C0086 for ; Tue, 14 Jun 2022 12:02:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208157; x=1686744157; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GFpQ+/mE/sWJAfnOObzg9+iqOuEtZS5nHGCJB+ROj7k=; b=LGLx92gl00T7oEKRYLDoL/2ZNKfBLRtj0zh33ICgXtbgH80EnCZCT0rs 8NJsuyXTagT0m9V7I6KJoCJZtEB/JJW22gt4/87vf5hnT3F/7yHvdGc2a l+9f+5oHcdtURNjBwT3LTN98fEt3wGvmF7bJ/j+F4PqZxvc9/dDLhaOZK NBMXJCzFaMQxwIZDWiEwf24o+FrlXZUlQojvE74+NiFvDo2eji3HBijjl St/kthJqXAfsfCO0wXYWLKRsYgBv+VPtT5cUi93v7HxZLnZh+9AETV+ZV DmfZvx9nE5o+VAS8FgCNqQolFzMATmqfofAzsqxmsgcfU0XN6lAz4XH8J A==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="279633978" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="279633978" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="588429901" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga007.fm.intel.com with ESMTP; 14 Jun 2022 05:02:28 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 94DE65D3; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 03/14] mm: Report unaccepted memory in meminfo Date: Tue, 14 Jun 2022 15:02:20 +0300 Message-Id: <20220614120231.48165-4-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208158; a=rsa-sha256; cv=none; b=q/OesI112cfptFdJ4vWSrA7yKAI9zhLV7BTEzNHSiwCvGULszsPnZ11RjrYeZlJP0ID76S XreEXFC5qphbE6lAKk/oKQvwiknpzg0A/1ssuV5/qW/5U+cB4/IUGgR25m5kCt0j160xyM fxTfXkBXahELXzn1orWrnqa8jw92evY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208158; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jL0eKVxAHX2MUtENKfniSWI5GFyZiu7u+0Ge5XWVgqM=; b=3nMhzslknQ6nQ74Lm8jVUWi1QdgfBfs1+ciP5AXB3SNzwaXaPGyTpXLiZuNUpD3rCE1YYM Pqsenysf3sffpqDjqYtxcNxCMKR/c/9MYusWwv6ZeYVseoZx+TGNLusGSYWUS3USdeEX0S WBaW8b7nLOAtFH37q6pFJSH8xC6Efy8= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=LGLx92gl; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=LGLx92gl; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 49B251C0086 X-Stat-Signature: cqcetu6jfffsmxcpcegus8nyd79srr8j X-HE-Tag: 1655208156-160255 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Track amount of unaccepted memory and report it in /proc/meminfo and in node meminfo. Signed-off-by: Kirill A. Shutemov --- drivers/base/node.c | 7 +++++++ fs/proc/meminfo.c | 5 +++++ include/linux/mmzone.h | 1 + mm/page_alloc.c | 9 ++++++++- mm/vmstat.c | 1 + 5 files changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 0ac6376ef7a1..fc7cf4d91eb6 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -446,6 +446,9 @@ static ssize_t node_read_meminfo(struct device *dev, "Node %d ShmemPmdMapped: %8lu kB\n" "Node %d FileHugePages: %8lu kB\n" "Node %d FilePmdMapped: %8lu kB\n" +#endif +#ifdef CONFIG_UNACCEPTED_MEMORY + "Node %d UnacceptedPages: %8lu kB\n" #endif , nid, K(node_page_state(pgdat, NR_FILE_DIRTY)), @@ -474,6 +477,10 @@ static ssize_t node_read_meminfo(struct device *dev, nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)), nid, K(node_page_state(pgdat, NR_FILE_THPS)), nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED)) +#endif +#ifdef CONFIG_UNACCEPTED_MEMORY + , + nid, K(node_page_state(pgdat, NR_UNACCEPTED)) #endif ); len += hugetlb_report_node_meminfo(buf, len, nid); diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index 6e89f0e2fd20..796544e50365 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -153,6 +153,11 @@ static int meminfo_proc_show(struct seq_file *m, void *v) global_zone_page_state(NR_FREE_CMA_PAGES)); #endif +#ifdef CONFIG_UNACCEPTED_MEMORY + show_val_kb(m, "UnacceptedPages:", + global_node_page_state(NR_UNACCEPTED)); +#endif + hugetlb_report_meminfo(m); arch_report_meminfo(m); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index aab70355d64f..aa08cd7eaaf5 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -212,6 +212,7 @@ enum node_stat_item { NR_FOLL_PIN_ACQUIRED, /* via: pin_user_page(), gup flag: FOLL_PIN */ NR_FOLL_PIN_RELEASED, /* pages returned via unpin_user_page() */ NR_KERNEL_STACK_KB, /* measured in KiB */ + NR_UNACCEPTED, #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK) NR_KERNEL_SCS_KB, /* measured in KiB */ #endif diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 279c2746aaa8..6316d695a567 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1012,6 +1012,7 @@ static void accept_page(struct page *page, unsigned int order) accept_memory(start, start + (PAGE_SIZE << order)); __ClearPageUnaccepted(page); + mod_node_page_state(page_pgdat(page), NR_UNACCEPTED, -(1 << order)); /* Assert that there is no PageUnaccepted() on tail pages */ if (IS_ENABLED(CONFIG_DEBUG_VM)) { @@ -1063,6 +1064,7 @@ static inline void __free_one_page(struct page *page, struct page *buddy; bool to_tail; bool page_needs_acceptance = false; + int nr_unaccepted = 0; VM_BUG_ON(!zone_is_initialized(zone)); VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page); @@ -1076,6 +1078,7 @@ static inline void __free_one_page(struct page *page, if (PageUnaccepted(page)) { page_needs_acceptance = true; + nr_unaccepted += 1 << order; __ClearPageUnaccepted(page); } @@ -1117,6 +1120,7 @@ static inline void __free_one_page(struct page *page, /* Mark page unaccepted if any of merged pages were unaccepted */ if (PageUnaccepted(buddy)) { page_needs_acceptance = true; + nr_unaccepted += 1 << order; __ClearPageUnaccepted(buddy); } @@ -1143,8 +1147,11 @@ static inline void __free_one_page(struct page *page, */ if (!page_needs_acceptance && (fpi_flags & FPI_UNACCEPTED_SLOWPATH)) page_needs_acceptance = page_contains_unaccepted(page, order); - if (page_needs_acceptance) + if (page_needs_acceptance) { __SetPageUnaccepted(page); + __mod_node_page_state(page_pgdat(page), NR_UNACCEPTED, + (1 << order) - nr_unaccepted); + } if (fpi_flags & FPI_TO_TAIL) to_tail = true; diff --git a/mm/vmstat.c b/mm/vmstat.c index 373d2730fcf2..4e12d22f1e04 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1236,6 +1236,7 @@ const char * const vmstat_text[] = { "nr_foll_pin_acquired", "nr_foll_pin_released", "nr_kernel_stack", + "nr_unaccepted", #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK) "nr_shadow_call_stack", #endif From patchwork Tue Jun 14 12:02:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 968ADC43334 for ; Tue, 14 Jun 2022 12:02:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 26FC38D0001; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 246CD8D023C; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0E9398D0001; Tue, 14 Jun 2022 08:02:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 014A08D023C for ; Tue, 14 Jun 2022 08:02:39 -0400 (EDT) Received: from smtpin31.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay13.hostedemail.com (Postfix) with ESMTP id CB8D16120F for ; Tue, 14 Jun 2022 12:02:39 +0000 (UTC) X-FDA: 79576704438.31.A35C4CA Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by imf28.hostedemail.com (Postfix) with ESMTP id EF5B4C00A9 for ; Tue, 14 Jun 2022 12:02:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208159; x=1686744159; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z3vk4MiFaMowdZHoqTpUcMK79OckKcSoieuVm3Gi93o=; b=S8ssF3IRbE2k34OTwxIC6UxuYMUaqlhu7yOzUqU7KN2jxJ2JLO9esFg8 6KYyzXs7LbexuAccpLpYdjS0DaCFZ4AEGsBDtw87VBy4BtyxsdcPUXRL2 oSwJ1e0nF9q1vt9TkttTyKL00uxB9l0d15/LBLhVuDBUsReUADzSmaj2i exUgdz8uVLbkSEu6gcVU3KbLWvNUKwG2zQhBw0o9IYszEylhW3ZBeyHb6 KMUUhk/f+4tCZ7DrYlBcY9rJsdO+Jz2qfEH1Ko4UUg/T0sY8Eur3yv7hg N3Jq+O3r0zp1aSeY28zSxqeDP/qIeDSo3IN2ufJQkMhYrWyy8NulJuStm Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="364934645" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="364934645" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="652001263" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga004.fm.intel.com with ESMTP; 14 Jun 2022 05:02:28 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 9CB1660E; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 04/14] efi/x86: Get full memory map in allocate_e820() Date: Tue, 14 Jun 2022 15:02:21 +0300 Message-Id: <20220614120231.48165-5-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208159; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ZpVgGsM9M9IiCP2g7i9gOKhp4UtHGI9JInMO/6uWhoQ=; b=dZhNtbfkrBoa+cB3VeNqBH3cq+TPsyJM+uAgE0lSdsrcYMfRPfwi8etvrBIQr65mZbfw7u MQr3QoKN7IRBchgXcbGiPOLqob03tH5bMybfSW2dYOM3O3ZyuocvUwj0Gi+HcT0z/OSwaQ 2wjjjGH3NtXvDUl6Apya/I65CVj8fBw= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=S8ssF3IR; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208159; a=rsa-sha256; cv=none; b=tBZiEgcq5EPSWd96uJa7JfHo1XDJ4ub3Qj2AALQpf8miDCumOqC5niHHcGIT5RZ73AvhQJ ONWHbWAya5hp9696QBdnGEFnWmoBJvS4M2AGMkANpX/eni6Katw7SQn6ZIAcTUbTNZpUhm d+TzbacT7RArN4W+MBmx0abjQErxCSw= X-Stat-Signature: yh6yzoi6ja9y5jpcj7d1pbostzxnk7yp X-Rspamd-Queue-Id: EF5B4C00A9 X-Rspamd-Server: rspam11 X-Rspam-User: Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=S8ssF3IR; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-HE-Tag: 1655208158-709847 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Currently allocate_e820() only interested in the size of map and size of memory descriptor to determine how many e820 entries the kernel needs. UEFI Specification version 2.9 introduces a new memory type -- unaccepted memory. To track unaccepted memory kernel needs to allocate a bitmap. The size of the bitmap is dependent on the maximum physical address present in the system. A full memory map is required to find the maximum address. Modify allocate_e820() to get a full memory map. This is preparation for the next patch that implements handling of unaccepted memory in EFI stub. Signed-off-by: Kirill A. Shutemov Reviewed-by: Borislav Petkov --- drivers/firmware/efi/libstub/x86-stub.c | 28 +++++++++++-------------- 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 05ae8bcc9d67..504955368934 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -672,31 +672,27 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext, } static efi_status_t allocate_e820(struct boot_params *params, + struct efi_boot_memmap *map, struct setup_data **e820ext, u32 *e820ext_size) { - unsigned long map_size, desc_size, map_key; efi_status_t status; - __u32 nr_desc, desc_version; + __u32 nr_desc; - /* Only need the size of the mem map and size of each mem descriptor */ - map_size = 0; - status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key, - &desc_size, &desc_version); - if (status != EFI_BUFFER_TOO_SMALL) - return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED; - - nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS; + status = efi_get_memory_map(map); + if (status != EFI_SUCCESS) + return status; - if (nr_desc > ARRAY_SIZE(params->e820_table)) { - u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table); + nr_desc = *map->map_size / *map->desc_size; + if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) { + u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) + + EFI_MMAP_NR_SLACK_SLOTS; status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size); - if (status != EFI_SUCCESS) - return status; } - return EFI_SUCCESS; + efi_bs_call(free_pool, *map->map); + return status; } struct exit_boot_struct { @@ -745,7 +741,7 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle) priv.boot_params = boot_params; priv.efi = &boot_params->efi_info; - status = allocate_e820(boot_params, &e820ext, &e820ext_size); + status = allocate_e820(boot_params, &map, &e820ext, &e820ext_size); if (status != EFI_SUCCESS) return status; From patchwork Tue Jun 14 12:02:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880962 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72930CCA47A for ; Tue, 14 Jun 2022 12:02:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6B6B68D0241; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 642E28D0242; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2E3588D0241; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 1C9248D023D for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id EEDE235E9E for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) X-FDA: 79576704648.18.95EAAE5 Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by imf10.hostedemail.com (Postfix) with ESMTP id 36B3EC0090 for ; Tue, 14 Jun 2022 12:02:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208164; x=1686744164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cjIhhBNmMg3rAyBhnGeB4/sD2KLKXglIHIAjstpNJxM=; b=iTXWMXYpopVru18E8l4brVvUCJNOmqpGmVF+70Aawby2Hanx/n3bTXtb 5Bzv+hi1t/Zk0EauQSbThB5vZpS629bzgZkWxDnurSjsJOsOu8BE2XhVK 3rkfb5PO884gcLPNOQYARrWh/cyTGWLwZIccmVygMRIgMsLkmIn6KEmh0 /KH60J3VDAFOmHlV60ng8bpys/QXfyzbL0YNH8Xp43tEmHdR4r8hBM/tF irLfcLnlrBLSeGKPZrp7NnqUqDs0qmuB4IeTfteM9t8cY648DHuy+uMiR DBSMsjlPFyW4QFUkv2xgBkZrxmE8Yu0GpkqO/7QyyRK7gA95X/ni40QRZ w==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="278636125" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="278636125" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="910967508" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga005.fm.intel.com with ESMTP; 14 Jun 2022 05:02:35 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id A4CBA702; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 05/14] x86/boot: Add infrastructure required for unaccepted memory support Date: Tue, 14 Jun 2022 15:02:22 +0300 Message-Id: <20220614120231.48165-6-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208164; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=2ovBsVtCcpYMVm4FgSyzNMdXAm9s5QEYZI++3oRUSAk=; b=tKvu7+Qp5YzshoL8C4KxujlIiwgyKcI1aYkj7LyfN7MdpdTUAlWnMb5YF3N6zIus8zSxyW NGTEc4Uu0kb+vxTspRWxgWyV0+4jXljumN03M68mMIxAR5Z+oFp8lDMeIWgcEk2r545eNF 4K1nh/aCNpjxcUhW29ZiV0/PkwbhVbs= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=iTXWMXYp; spf=none (imf10.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.115) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208164; a=rsa-sha256; cv=none; b=Lu0xMauwJ8K3oJm7YjwPFfFmuR+eSJx83c2UjZNN0+cZ3MQWK4JeR1okDYW7K55mzYuwk5 BlubmdevylQzB5UdIJkkeiwpzcS2ZttU3nzZI3jlOoY0bT3HlwT02Sby29qucERmdyebcv EkBr9fOiOYkfKYxFcNdmFQEg71Sz3qo= X-Stat-Signature: cue6m6ubkuie9x1rq4jgwkzbsrgcfjdj X-Rspamd-Queue-Id: 36B3EC0090 X-Rspamd-Server: rspam11 X-Rspam-User: Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=iTXWMXYp; spf=none (imf10.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.115) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-HE-Tag: 1655208163-665003 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Pull functionality from the main kernel headers and lib/ that is required for unaccepted memory support. This is preparatory patch. The users for the functionality will come in following patches. Signed-off-by: Kirill A. Shutemov --- arch/x86/boot/bitops.h | 40 ++++++++++++ arch/x86/boot/compressed/align.h | 14 +++++ arch/x86/boot/compressed/bitmap.c | 43 +++++++++++++ arch/x86/boot/compressed/bitmap.h | 49 +++++++++++++++ arch/x86/boot/compressed/bits.h | 36 +++++++++++ arch/x86/boot/compressed/compiler.h | 9 +++ arch/x86/boot/compressed/find.c | 54 ++++++++++++++++ arch/x86/boot/compressed/find.h | 80 ++++++++++++++++++++++++ arch/x86/boot/compressed/math.h | 37 +++++++++++ arch/x86/boot/compressed/minmax.h | 61 ++++++++++++++++++ arch/x86/boot/compressed/pgtable_types.h | 25 ++++++++ 11 files changed, 448 insertions(+) create mode 100644 arch/x86/boot/compressed/align.h create mode 100644 arch/x86/boot/compressed/bitmap.c create mode 100644 arch/x86/boot/compressed/bitmap.h create mode 100644 arch/x86/boot/compressed/bits.h create mode 100644 arch/x86/boot/compressed/compiler.h create mode 100644 arch/x86/boot/compressed/find.c create mode 100644 arch/x86/boot/compressed/find.h create mode 100644 arch/x86/boot/compressed/math.h create mode 100644 arch/x86/boot/compressed/minmax.h create mode 100644 arch/x86/boot/compressed/pgtable_types.h diff --git a/arch/x86/boot/bitops.h b/arch/x86/boot/bitops.h index 02e1dea11d94..61eb820ee402 100644 --- a/arch/x86/boot/bitops.h +++ b/arch/x86/boot/bitops.h @@ -41,4 +41,44 @@ static inline void set_bit(int nr, void *addr) asm("btsl %1,%0" : "+m" (*(u32 *)addr) : "Ir" (nr)); } +static __always_inline void __set_bit(long nr, volatile unsigned long *addr) +{ + asm volatile(__ASM_SIZE(bts) " %1,%0" : : "m" (*(volatile long *) addr), + "Ir" (nr) : "memory"); +} + +static __always_inline void __clear_bit(long nr, volatile unsigned long *addr) +{ + asm volatile(__ASM_SIZE(btr) " %1,%0" : : "m" (*(volatile long *) addr), + "Ir" (nr) : "memory"); +} + +/** + * __ffs - find first set bit in word + * @word: The word to search + * + * Undefined if no bit exists, so code should check against 0 first. + */ +static __always_inline unsigned long __ffs(unsigned long word) +{ + asm("rep; bsf %1,%0" + : "=r" (word) + : "rm" (word)); + return word; +} + +/** + * ffz - find first zero bit in word + * @word: The word to search + * + * Undefined if no zero exists, so code should check against ~0UL first. + */ +static __always_inline unsigned long ffz(unsigned long word) +{ + asm("rep; bsf %1,%0" + : "=r" (word) + : "r" (~word)); + return word; +} + #endif /* BOOT_BITOPS_H */ diff --git a/arch/x86/boot/compressed/align.h b/arch/x86/boot/compressed/align.h new file mode 100644 index 000000000000..7ccabbc5d1b8 --- /dev/null +++ b/arch/x86/boot/compressed/align.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_ALIGN_H +#define BOOT_ALIGN_H +#define _LINUX_ALIGN_H /* Inhibit inclusion of */ + +/* @a is a power of 2 value */ +#define ALIGN(x, a) __ALIGN_KERNEL((x), (a)) +#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a)) +#define __ALIGN_MASK(x, mask) __ALIGN_KERNEL_MASK((x), (mask)) +#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a))) +#define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a))) +#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0) + +#endif diff --git a/arch/x86/boot/compressed/bitmap.c b/arch/x86/boot/compressed/bitmap.c new file mode 100644 index 000000000000..789ecadeb521 --- /dev/null +++ b/arch/x86/boot/compressed/bitmap.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "bitmap.h" + +void __bitmap_set(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p = map + BIT_WORD(start); + const unsigned int size = start + len; + int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_set >= 0) { + *p |= mask_to_set; + len -= bits_to_set; + bits_to_set = BITS_PER_LONG; + mask_to_set = ~0UL; + p++; + } + if (len) { + mask_to_set &= BITMAP_LAST_WORD_MASK(size); + *p |= mask_to_set; + } +} + +void __bitmap_clear(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p = map + BIT_WORD(start); + const unsigned int size = start + len; + int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_clear >= 0) { + *p &= ~mask_to_clear; + len -= bits_to_clear; + bits_to_clear = BITS_PER_LONG; + mask_to_clear = ~0UL; + p++; + } + if (len) { + mask_to_clear &= BITMAP_LAST_WORD_MASK(size); + *p &= ~mask_to_clear; + } +} diff --git a/arch/x86/boot/compressed/bitmap.h b/arch/x86/boot/compressed/bitmap.h new file mode 100644 index 000000000000..35357f5feda2 --- /dev/null +++ b/arch/x86/boot/compressed/bitmap.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_BITMAP_H +#define BOOT_BITMAP_H +#define __LINUX_BITMAP_H /* Inhibit inclusion of */ + +#include "../bitops.h" +#include "../string.h" +#include "align.h" + +#define BITMAP_MEM_ALIGNMENT 8 +#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1) + +#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1))) +#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) + +#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) + +void __bitmap_set(unsigned long *map, unsigned int start, int len); +void __bitmap_clear(unsigned long *map, unsigned int start, int len); + +static __always_inline void bitmap_set(unsigned long *map, unsigned int start, + unsigned int nbits) +{ + if (__builtin_constant_p(nbits) && nbits == 1) + __set_bit(start, map); + else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && + IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && + __builtin_constant_p(nbits & BITMAP_MEM_MASK) && + IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) + memset((char *)map + start / 8, 0xff, nbits / 8); + else + __bitmap_set(map, start, nbits); +} + +static __always_inline void bitmap_clear(unsigned long *map, unsigned int start, + unsigned int nbits) +{ + if (__builtin_constant_p(nbits) && nbits == 1) + __clear_bit(start, map); + else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && + IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && + __builtin_constant_p(nbits & BITMAP_MEM_MASK) && + IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) + memset((char *)map + start / 8, 0, nbits / 8); + else + __bitmap_clear(map, start, nbits); +} + +#endif diff --git a/arch/x86/boot/compressed/bits.h b/arch/x86/boot/compressed/bits.h new file mode 100644 index 000000000000..b0ffa007ee19 --- /dev/null +++ b/arch/x86/boot/compressed/bits.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_BITS_H +#define BOOT_BITS_H +#define __LINUX_BITS_H /* Inhibit inclusion of */ + +#ifdef __ASSEMBLY__ +#define _AC(X,Y) X +#define _AT(T,X) X +#else +#define __AC(X,Y) (X##Y) +#define _AC(X,Y) __AC(X,Y) +#define _AT(T,X) ((T)(X)) +#endif + +#define _UL(x) (_AC(x, UL)) +#define _ULL(x) (_AC(x, ULL)) +#define UL(x) (_UL(x)) +#define ULL(x) (_ULL(x)) + +#define BIT(nr) (UL(1) << (nr)) +#define BIT_ULL(nr) (ULL(1) << (nr)) +#define BIT_MASK(nr) (UL(1) << ((nr) % BITS_PER_LONG)) +#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) +#define BIT_ULL_MASK(nr) (ULL(1) << ((nr) % BITS_PER_LONG_LONG)) +#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG) +#define BITS_PER_BYTE 8 + +#define GENMASK(h, l) \ + (((~UL(0)) - (UL(1) << (l)) + 1) & \ + (~UL(0) >> (BITS_PER_LONG - 1 - (h)))) + +#define GENMASK_ULL(h, l) \ + (((~ULL(0)) - (ULL(1) << (l)) + 1) & \ + (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h)))) + +#endif diff --git a/arch/x86/boot/compressed/compiler.h b/arch/x86/boot/compressed/compiler.h new file mode 100644 index 000000000000..452c4c0844b9 --- /dev/null +++ b/arch/x86/boot/compressed/compiler.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_COMPILER_H +#define BOOT_COMPILER_H +#define __LINUX_COMPILER_H /* Inhibit inclusion of */ + +# define likely(x) __builtin_expect(!!(x), 1) +# define unlikely(x) __builtin_expect(!!(x), 0) + +#endif diff --git a/arch/x86/boot/compressed/find.c b/arch/x86/boot/compressed/find.c new file mode 100644 index 000000000000..839be91aae52 --- /dev/null +++ b/arch/x86/boot/compressed/find.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include "bitmap.h" +#include "find.h" +#include "math.h" +#include "minmax.h" + +static __always_inline unsigned long swab(const unsigned long y) +{ +#if __BITS_PER_LONG == 64 + return __builtin_bswap32(y); +#else /* __BITS_PER_LONG == 32 */ + return __builtin_bswap64(y); +#endif +} + +unsigned long _find_next_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long nbits, + unsigned long start, unsigned long invert, unsigned long le) +{ + unsigned long tmp, mask; + + if (unlikely(start >= nbits)) + return nbits; + + tmp = addr1[start / BITS_PER_LONG]; + if (addr2) + tmp &= addr2[start / BITS_PER_LONG]; + tmp ^= invert; + + /* Handle 1st word. */ + mask = BITMAP_FIRST_WORD_MASK(start); + if (le) + mask = swab(mask); + + tmp &= mask; + + start = round_down(start, BITS_PER_LONG); + + while (!tmp) { + start += BITS_PER_LONG; + if (start >= nbits) + return nbits; + + tmp = addr1[start / BITS_PER_LONG]; + if (addr2) + tmp &= addr2[start / BITS_PER_LONG]; + tmp ^= invert; + } + + if (le) + tmp = swab(tmp); + + return min(start + __ffs(tmp), nbits); +} diff --git a/arch/x86/boot/compressed/find.h b/arch/x86/boot/compressed/find.h new file mode 100644 index 000000000000..799176972ebc --- /dev/null +++ b/arch/x86/boot/compressed/find.h @@ -0,0 +1,80 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_FIND_H +#define BOOT_FIND_H +#define __LINUX_FIND_H /* Inhibit inclusion of */ + +#include "../bitops.h" +#include "align.h" +#include "bits.h" +#include "compiler.h" + +unsigned long _find_next_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long nbits, + unsigned long start, unsigned long invert, unsigned long le); + +/** + * find_next_bit - find the next set bit in a memory region + * @addr: The address to base the search on + * @offset: The bitnumber to start searching at + * @size: The bitmap size in bits + * + * Returns the bit number for the next set bit + * If no bits are set, returns @size. + */ +static inline +unsigned long find_next_bit(const unsigned long *addr, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (unlikely(offset >= size)) + return size; + + val = *addr & GENMASK(size - 1, offset); + return val ? __ffs(val) : size; + } + + return _find_next_bit(addr, NULL, size, offset, 0UL, 0); +} + +/** + * find_next_zero_bit - find the next cleared bit in a memory region + * @addr: The address to base the search on + * @offset: The bitnumber to start searching at + * @size: The bitmap size in bits + * + * Returns the bit number of the next zero bit + * If no bits are zero, returns @size. + */ +static inline +unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (unlikely(offset >= size)) + return size; + + val = *addr | ~GENMASK(size - 1, offset); + return val == ~0UL ? size : ffz(val); + } + + return _find_next_bit(addr, NULL, size, offset, ~0UL, 0); +} + +/** + * for_each_set_bitrange_from - iterate over all set bit ranges [b; e) + * @b: bit offset of start of current bitrange (first set bit); must be initialized + * @e: bit offset of end of current bitrange (first unset bit) + * @addr: bitmap address to base the search on + * @size: bitmap size in number of bits + */ +#define for_each_set_bitrange_from(b, e, addr, size) \ + for ((b) = find_next_bit((addr), (size), (b)), \ + (e) = find_next_zero_bit((addr), (size), (b) + 1); \ + (b) < (size); \ + (b) = find_next_bit((addr), (size), (e) + 1), \ + (e) = find_next_zero_bit((addr), (size), (b) + 1)) +#endif diff --git a/arch/x86/boot/compressed/math.h b/arch/x86/boot/compressed/math.h new file mode 100644 index 000000000000..f7eede84bbc2 --- /dev/null +++ b/arch/x86/boot/compressed/math.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_MATH_H +#define BOOT_MATH_H +#define __LINUX_MATH_H /* Inhibit inclusion of */ + +/* + * + * This looks more complex than it should be. But we need to + * get the type for the ~ right in round_down (it needs to be + * as wide as the result!), and we want to evaluate the macro + * arguments just once each. + */ +#define __round_mask(x, y) ((__typeof__(x))((y)-1)) + +/** + * round_up - round up to next specified power of 2 + * @x: the value to round + * @y: multiple to round up to (must be a power of 2) + * + * Rounds @x up to next multiple of @y (which must be a power of 2). + * To perform arbitrary rounding up, use roundup() below. + */ +#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1) + +/** + * round_down - round down to next specified power of 2 + * @x: the value to round + * @y: multiple to round down to (must be a power of 2) + * + * Rounds @x down to next multiple of @y (which must be a power of 2). + * To perform arbitrary rounding down, use rounddown() below. + */ +#define round_down(x, y) ((x) & ~__round_mask(x, y)) + +#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) + +#endif diff --git a/arch/x86/boot/compressed/minmax.h b/arch/x86/boot/compressed/minmax.h new file mode 100644 index 000000000000..4efd05673260 --- /dev/null +++ b/arch/x86/boot/compressed/minmax.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_MINMAX_H +#define BOOT_MINMAX_H +#define __LINUX_MINMAX_H /* Inhibit inclusion of */ + +/* + * This returns a constant expression while determining if an argument is + * a constant expression, most importantly without evaluating the argument. + * Glory to Martin Uecker + */ +#define __is_constexpr(x) \ + (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8))) + +/* + * min()/max()/clamp() macros must accomplish three things: + * + * - avoid multiple evaluations of the arguments (so side-effects like + * "x++" happen only once) when non-constant. + * - perform strict type-checking (to generate warnings instead of + * nasty runtime surprises). See the "unnecessary" pointer comparison + * in __typecheck(). + * - retain result as a constant expressions when called with only + * constant expressions (to avoid tripping VLA warnings in stack + * allocation usage). + */ +#define __typecheck(x, y) \ + (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) + +#define __no_side_effects(x, y) \ + (__is_constexpr(x) && __is_constexpr(y)) + +#define __safe_cmp(x, y) \ + (__typecheck(x, y) && __no_side_effects(x, y)) + +#define __cmp(x, y, op) ((x) op (y) ? (x) : (y)) + +#define __cmp_once(x, y, unique_x, unique_y, op) ({ \ + typeof(x) unique_x = (x); \ + typeof(y) unique_y = (y); \ + __cmp(unique_x, unique_y, op); }) + +#define __careful_cmp(x, y, op) \ + __builtin_choose_expr(__safe_cmp(x, y), \ + __cmp(x, y, op), \ + __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op)) + +/** + * min - return minimum of two values of the same or compatible types + * @x: first value + * @y: second value + */ +#define min(x, y) __careful_cmp(x, y, <) + +/** + * max - return maximum of two values of the same or compatible types + * @x: first value + * @y: second value + */ +#define max(x, y) __careful_cmp(x, y, >) + +#endif diff --git a/arch/x86/boot/compressed/pgtable_types.h b/arch/x86/boot/compressed/pgtable_types.h new file mode 100644 index 000000000000..8f1d87a69efc --- /dev/null +++ b/arch/x86/boot/compressed/pgtable_types.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef BOOT_COMPRESSED_PGTABLE_TYPES_H +#define BOOT_COMPRESSED_PGTABLE_TYPES_H +#define _ASM_X86_PGTABLE_DEFS_H /* Inhibit inclusion of */ + +#define PAGE_SHIFT 12 + +#ifdef CONFIG_X86_64 +#define PTE_SHIFT 9 +#elif defined CONFIG_X86_PAE +#define PTE_SHIFT 9 +#else /* 2-level */ +#define PTE_SHIFT 10 +#endif + +enum pg_level { + PG_LEVEL_NONE, + PG_LEVEL_4K, + PG_LEVEL_2M, + PG_LEVEL_1G, + PG_LEVEL_512G, + PG_LEVEL_NUM +}; + +#endif From patchwork Tue Jun 14 12:02:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880964 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B62B3C433EF for ; Tue, 14 Jun 2022 12:02:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C615B8D0242; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A26458D0245; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5C8DE8D023D; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 41EF58D0242 for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 0F2D221973 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) X-FDA: 79576704690.06.34127CD Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf24.hostedemail.com (Postfix) with ESMTP id 48A3818008D for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208164; x=1686744164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=osAsv6UVE+6nsQGbK3e34qgPjlvTm/Mjt61vlC9yscs=; b=MiRJVASB2F2ZHxq6a7gA9hHv8492YApDl5Zr7mMjowQ4DLOawsP5tPmu W9VETyzhVJqlx3LiZv/RumUHzHH74IEa8ChqHhayX2qPfavf4cvVUalJL wzgXGyn/kN3KHpCOruT1HyWJPwtwd2+Ox/raKIdjNPhM/SFHPcKZ8Xrdz 5kmbHoMDKaGAQfhoBaaxtGwKj3zIRWFTluqH8WL2dyrjbEOJk+eowPVeo peRvDUKAYFVhqHLgZjS6v/GnVAicOR+xnNJ8KwdMdTKvST87HnthT2dC7 fSFoFrmmSH9LWrOHJAH7wQ9uCmHTKoW2eu9J9a8h+xbroKgof7pAjqyjj g==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="259048720" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="259048720" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="652001295" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga004.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id ACA37710; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 06/14] efi/x86: Implement support for unaccepted memory Date: Tue, 14 Jun 2022 15:02:23 +0300 Message-Id: <20220614120231.48165-7-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208164; a=rsa-sha256; cv=none; b=PENWFE0r/FyWw+YiCUMXys4DN2BT/dq52/PSpzaGlLWKaLimNxwTWSOBFI7Nf0++FNeHIn p6tRve4Z1NErmzlJ73xq0UQ77TkvhN81O+1xnWtKg8QDy4eRLLnDbo5pB+wOkQSIpDmqW6 zYi9e3Y5PKrHbwdbC7qoNKInXPpgeAk= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=MiRJVASB; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.151) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208164; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CYfCHqkBhq5NDnT9kReRuHgqQnhcYZKizuhsCXExUeI=; b=1rSAC4rtnwwkDO3+HCzFAUtwq4txyxAyl7CZLTBLIyoAiTSNR95BvUvMOgRTc9NNZHLqI5 puAmJENGGa269Zuj3s/1eMXQlsXwm1SymXfktA2vzy/fDhz5jivMzEFOezOQTZFD6EE4Jt v7UFCmMuCafBBmt8DrqNLsP9ARe+ic0= X-Stat-Signature: oatydu8winpfyf4pecybq1ygeiqqggfu X-Rspamd-Queue-Id: 48A3818008D Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=MiRJVASB; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.151) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspamd-Server: rspam07 X-Rspam-User: X-HE-Tag: 1655208164-925440 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: UEFI Specification version 2.9 introduces the concept of memory acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, requiring memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific for the Virtual Machine platform. Accepting memory is costly and it makes VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 has to be modified on every page acceptance. It leads to table fragmentation, but there's a limited number of entries in the e820 table Another option is to mark such memory as usable in e820 and track if the range has been accepted in a bitmap. One bit in the bitmap represents 2MiB in the address space: one 4k page is enough to track 64GiB or physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- It needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to 2M gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via boot_params. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the maximum address in the memory map. The same boot_params.unaccepted_memory can be used to pass the bitmap between two kernels on kexec, but the use-case is not yet implemented. The implementation requires some basic helpers in boot stub. They provided by linux/ includes in the main kernel image, but is not present in boot stub. Create copy of required functionality in the boot stub. Signed-off-by: Kirill A. Shutemov --- Documentation/x86/zero-page.rst | 1 + arch/x86/boot/compressed/Makefile | 1 + arch/x86/boot/compressed/mem.c | 68 ++++++++++++++++++++++++ arch/x86/include/asm/unaccepted_memory.h | 10 ++++ arch/x86/include/uapi/asm/bootparam.h | 2 +- drivers/firmware/efi/Kconfig | 14 +++++ drivers/firmware/efi/efi.c | 1 + drivers/firmware/efi/libstub/x86-stub.c | 68 ++++++++++++++++++++++++ include/linux/efi.h | 3 +- 9 files changed, 166 insertions(+), 2 deletions(-) create mode 100644 arch/x86/boot/compressed/mem.c create mode 100644 arch/x86/include/asm/unaccepted_memory.h diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst index 45aa9cceb4f1..f21905e61ade 100644 --- a/Documentation/x86/zero-page.rst +++ b/Documentation/x86/zero-page.rst @@ -20,6 +20,7 @@ Offset/Size Proto Name Meaning 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information (struct ist_info) 070/008 ALL acpi_rsdp_addr Physical address of ACPI RSDP table +078/008 ALL unaccepted_memory Bitmap of unaccepted memory (1bit == 2M) 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!! 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!! 0A0/010 ALL sys_desc_table System description table (struct sys_desc_table), diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 19e1905dcbf6..67732478323f 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -102,6 +102,7 @@ endif vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o +vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/mem.o vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c new file mode 100644 index 000000000000..415df0d3bc81 --- /dev/null +++ b/arch/x86/boot/compressed/mem.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../cpuflags.h" +#include "bitmap.h" +#include "error.h" +#include "math.h" + +#define PMD_SHIFT 21 +#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) +#define PMD_MASK (~(PMD_SIZE - 1)) + +static inline void __accept_memory(phys_addr_t start, phys_addr_t end) +{ + /* Platform-specific memory-acceptance call goes here */ + error("Cannot accept memory"); +} + +/* + * The accepted memory bitmap only works at PMD_SIZE granularity. If a request + * comes in to mark memory as unaccepted which is not PMD_SIZE-aligned, simply + * accept the memory now since it can not be *marked* as unaccepted. + */ +void process_unaccepted_memory(struct boot_params *params, u64 start, u64 end) +{ + /* + * Accept small regions that might not be able to be represented + * in the bitmap. This is a bit imprecise and may accept some + * areas that could have been represented in the bitmap instead. + * + * Consider case like this: + * + * | 4k | 2044k | 2048k | + * ^ 0x0 ^ 2MB ^ 4MB + * + * all memory in the range is unaccepted, except for the first 4k. + * The second 2M can be represented in the bitmap, but kernel accept it + * right away. The imprecision makes the code simpler by ensuring that + * at least one bit will be set int the bitmap below. + */ + if (end - start < 2 * PMD_SIZE) { + __accept_memory(start, end); + return; + } + + /* + * No matter how the start and end are aligned, at least one unaccepted + * PMD_SIZE area will remain. + */ + + /* Immediately accept a unaccepted_memory, + start / PMD_SIZE, (end - start) / PMD_SIZE); +} diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h new file mode 100644 index 000000000000..df0736d32858 --- /dev/null +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2020 Intel Corporation */ +#ifndef _ASM_X86_UNACCEPTED_MEMORY_H +#define _ASM_X86_UNACCEPTED_MEMORY_H + +struct boot_params; + +void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num); + +#endif diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index bea5cdcdf532..c62f6ee4b6f0 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -180,7 +180,7 @@ struct boot_params { __u64 tboot_addr; /* 0x058 */ struct ist_info ist_info; /* 0x060 */ __u64 acpi_rsdp_addr; /* 0x070 */ - __u8 _pad3[8]; /* 0x078 */ + __u64 unaccepted_memory; /* 0x078 */ __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */ __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */ struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */ diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index 7aa4717cdcac..e1270beff4dc 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -305,6 +305,20 @@ config EFI_COCO_SECRET virt/coco/efi_secret module to access the secrets, which in turn allows userspace programs to access the injected secrets. +config UNACCEPTED_MEMORY + bool + depends on EFI_STUB + help + Some Virtual Machine platforms, such as Intel TDX, require + some memory to be "accepted" by the guest before it can be used. + This mechanism helps prevent malicious hosts from making changes + to guest memory. + + UEFI specification v2.9 introduced EFI_UNACCEPTED_MEMORY memory type. + + This option adds support for unaccepted memory and makes such memory + usable by the kernel. + config EFI_EMBEDDED_FIRMWARE bool select CRYPTO_LIB_SHA256 diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 860534bcfdac..34d1ca870420 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -758,6 +758,7 @@ static __initdata char memory_type_name[][13] = { "MMIO Port", "PAL Code", "Persistent", + "Unaccepted", }; char * __init efi_md_typeattr_format(char *buf, size_t size, diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 504955368934..b91c89100b2d 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "efistub.h" @@ -607,6 +608,17 @@ setup_e820(struct boot_params *params, struct setup_data *e820ext, u32 e820ext_s e820_type = E820_TYPE_PMEM; break; + case EFI_UNACCEPTED_MEMORY: + if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) { + efi_warn_once("The system has unaccepted memory," + " but kernel does not support it\n"); + efi_warn_once("Consider enabling CONFIG_UNACCEPTED_MEMORY\n"); + continue; + } + e820_type = E820_TYPE_RAM; + process_unaccepted_memory(params, d->phys_addr, + d->phys_addr + PAGE_SIZE * d->num_pages); + break; default: continue; } @@ -671,6 +683,59 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext, return status; } +static efi_status_t allocate_unaccepted_memory(struct boot_params *params, + __u32 nr_desc, + struct efi_boot_memmap *map) +{ + unsigned long *mem = NULL; + u64 size, max_addr = 0; + efi_status_t status; + bool found = false; + int i; + + /* Check if there's any unaccepted memory and find the max address */ + for (i = 0; i < nr_desc; i++) { + efi_memory_desc_t *d; + + d = efi_early_memdesc_ptr(*map->map, *map->desc_size, i); + if (d->type == EFI_UNACCEPTED_MEMORY) + found = true; + if (d->phys_addr + d->num_pages * PAGE_SIZE > max_addr) + max_addr = d->phys_addr + d->num_pages * PAGE_SIZE; + } + + if (!found) { + params->unaccepted_memory = 0; + return EFI_SUCCESS; + } + + /* + * If unaccepted memory is present allocate a bitmap to track what + * memory has to be accepted before access. + * + * One bit in the bitmap represents 2MiB in the address space: + * A 4k bitmap can track 64GiB of physical address space. + * + * In the worst case scenario -- a huge hole in the middle of the + * address space -- It needs 256MiB to handle 4PiB of the address + * space. + * + * TODO: handle situation if params->unaccepted_memory is already set. + * It's required to deal with kexec. + * + * The bitmap will be populated in setup_e820() according to the memory + * map after efi_exit_boot_services(). + */ + size = DIV_ROUND_UP(max_addr, PMD_SIZE * BITS_PER_BYTE); + status = efi_allocate_pages(size, (unsigned long *)&mem, ULONG_MAX); + if (status == EFI_SUCCESS) { + memset(mem, 0, size); + params->unaccepted_memory = (unsigned long)mem; + } + + return status; +} + static efi_status_t allocate_e820(struct boot_params *params, struct efi_boot_memmap *map, struct setup_data **e820ext, @@ -691,6 +756,9 @@ static efi_status_t allocate_e820(struct boot_params *params, status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size); } + if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) && status == EFI_SUCCESS) + status = allocate_unaccepted_memory(params, nr_desc, map); + efi_bs_call(free_pool, *map->map); return status; } diff --git a/include/linux/efi.h b/include/linux/efi.h index 7d9b0bb47eb3..9c2fa94f2f93 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -108,7 +108,8 @@ typedef struct { #define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12 #define EFI_PAL_CODE 13 #define EFI_PERSISTENT_MEMORY 14 -#define EFI_MAX_MEMORY_TYPE 15 +#define EFI_UNACCEPTED_MEMORY 15 +#define EFI_MAX_MEMORY_TYPE 16 /* Attribute values: */ #define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */ From patchwork Tue Jun 14 12:02:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880966 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55622C43334 for ; Tue, 14 Jun 2022 12:03:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 46E318D0246; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 37ED48D0245; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EACBE8D0244; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id C1F788D0246 for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 9AC6735472 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) X-FDA: 79576704690.18.9523467 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf19.hostedemail.com (Postfix) with ESMTP id BB12F1A009F for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208164; x=1686744164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CodHwoZ2rLYrhqQ6ze00eaB7F0xVPSzWsLToYorZfcA=; b=d17+Mfrx6n4KlkA/VSX+6UU1nRifj9IWR0dhK05M93GQhVy6Vh8WAXmQ MJI3MyQ5K1venOq12I3mQSsAxcb6IPH9PMYXxJHINzx5hR1cTOLacWfCg aIqdWIOwWL7xtbcjmrKR14vtFvza2H2QPVu3FaYMAmjVzTOLvor59R1hw XZXViOaOOTA7Rq873WyGdUARsJMpK/sJkzhIImtWfbHLtyjD0IGBM2gkP R7e2pX+GIGCQ8Jr+G61CSfN9KCyUR+I8CgrPN+f0LN/HNUOrw+yrCZxb9 P3nJLpDQ6UiiBJ9PZH1mMJFPz8Xau9EciP+gtjG5UrJbAiwUf6CHXcO/s Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="267281684" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="267281684" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="761936303" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga005.jf.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id B8724793; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 07/14] x86/boot/compressed: Handle unaccepted memory Date: Tue, 14 Jun 2022 15:02:24 +0300 Message-Id: <20220614120231.48165-8-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208165; a=rsa-sha256; cv=none; b=Ne00Pif/qyaI9p22DwboNd5XAG/dgr+2nnRKtXaIegxYns0lI8r/DKt8qtfqSR6ZcehQIn /epLBB7/knJRGCd8XH9cj/U7uGQemeCS9TB9RVpxzDdJ2KMG6vKfBj/dNVyGW8tZo+vQCB qNMU2Zy+9e181vRAlpwpNjCwfYCNCnQ= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=d17+Mfrx; spf=none (imf19.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208165; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Qq/XtnbrPyAehpX7yqYgTIYlYA5Bvnc7iGnBnwzz/ek=; b=yRUmnkolW0KJ2x4BKPZ9Gz/bM91UC2SHvR3DA5IjdHW2FA35jqs/gdk9+UzfOEV02/lMvz Rdu82syoNXZF/Naw6V2i/719csC6Nixcaz3YIgiF9RqtlxQPXSvEpp+8JKRHHEXhp0rS9R 3mMkpDZef2uE/t02FlhTWbEIbZywtKI= X-Stat-Signature: nzrge4ay9mbbgy9td6wpepfi6dq3myhg X-Rspamd-Queue-Id: BB12F1A009F Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=d17+Mfrx; spf=none (imf19.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655208164-376275 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The firmware will pre-accept the memory used to run the stub. But, the stub is responsible for accepting the memory into which it decompresses the main kernel. Accept memory just before decompression starts. The stub is also responsible for choosing a physical address in which to place the decompressed kernel image. The KASLR mechanism will randomize this physical address. Since the unaccepted memory region is relatively small, KASLR would be quite ineffective if it only used the pre-accepted area (EFI_CONVENTIONAL_MEMORY). Ensure that KASLR randomizes among the entire physical address space by also including EFI_UNACCEPTED_MEMORY. Signed-off-by: Kirill A. Shutemov --- arch/x86/boot/compressed/Makefile | 2 +- arch/x86/boot/compressed/efi.h | 1 + arch/x86/boot/compressed/kaslr.c | 35 ++++++++++++++++-------- arch/x86/boot/compressed/mem.c | 18 ++++++++++++ arch/x86/boot/compressed/misc.c | 6 ++++ arch/x86/boot/compressed/misc.h | 6 ++++ arch/x86/include/asm/unaccepted_memory.h | 2 ++ 7 files changed, 57 insertions(+), 13 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 67732478323f..85a631f5cdff 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -102,7 +102,7 @@ endif vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o -vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/mem.o +vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/find.o $(obj)/mem.o vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o diff --git a/arch/x86/boot/compressed/efi.h b/arch/x86/boot/compressed/efi.h index 7db2f41b54cd..cf475243b6d5 100644 --- a/arch/x86/boot/compressed/efi.h +++ b/arch/x86/boot/compressed/efi.h @@ -32,6 +32,7 @@ typedef struct { } efi_table_hdr_t; #define EFI_CONVENTIONAL_MEMORY 7 +#define EFI_UNACCEPTED_MEMORY 15 #define EFI_MEMORY_MORE_RELIABLE \ ((u64)0x0000000000010000ULL) /* higher reliability */ diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c index 4a3f223973f4..a42264b0b2f5 100644 --- a/arch/x86/boot/compressed/kaslr.c +++ b/arch/x86/boot/compressed/kaslr.c @@ -671,6 +671,28 @@ static bool process_mem_region(struct mem_vector *region, } #ifdef CONFIG_EFI + +/* + * Only EFI_CONVENTIONAL_MEMORY and EFI_UNACCEPTED_MEMORY (if supported) are + * guaranteed to be free. + * + * It is more conservative in picking free memory than the EFI spec allows: + * + * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also free memory + * and thus available to place the kernel image into, but in practice there's + * firmware where using that memory leads to crashes. + */ +static inline bool memory_type_is_free(efi_memory_desc_t *md) +{ + if (md->type == EFI_CONVENTIONAL_MEMORY) + return true; + + if (md->type == EFI_UNACCEPTED_MEMORY) + return IS_ENABLED(CONFIG_UNACCEPTED_MEMORY); + + return false; +} + /* * Returns true if we processed the EFI memmap, which we prefer over the E820 * table if it is available. @@ -715,18 +737,7 @@ process_efi_entries(unsigned long minimum, unsigned long image_size) for (i = 0; i < nr_desc; i++) { md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i); - /* - * Here we are more conservative in picking free memory than - * the EFI spec allows: - * - * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also - * free memory and thus available to place the kernel image into, - * but in practice there's firmware where using that memory leads - * to crashes. - * - * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free. - */ - if (md->type != EFI_CONVENTIONAL_MEMORY) + if (!memory_type_is_free(md)) continue; if (efi_soft_reserve_enabled() && diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c index 415df0d3bc81..b45458af00ca 100644 --- a/arch/x86/boot/compressed/mem.c +++ b/arch/x86/boot/compressed/mem.c @@ -3,12 +3,15 @@ #include "../cpuflags.h" #include "bitmap.h" #include "error.h" +#include "find.h" #include "math.h" #define PMD_SHIFT 21 #define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) #define PMD_MASK (~(PMD_SIZE - 1)) +extern struct boot_params *boot_params; + static inline void __accept_memory(phys_addr_t start, phys_addr_t end) { /* Platform-specific memory-acceptance call goes here */ @@ -66,3 +69,18 @@ void process_unaccepted_memory(struct boot_params *params, u64 start, u64 end) bitmap_set((unsigned long *)params->unaccepted_memory, start / PMD_SIZE, (end - start) / PMD_SIZE); } + +void accept_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long range_start, range_end; + unsigned long *bitmap, bitmap_size; + + bitmap = (unsigned long *)boot_params->unaccepted_memory; + range_start = start / PMD_SIZE; + bitmap_size = DIV_ROUND_UP(end, PMD_SIZE); + + for_each_set_bitrange_from(range_start, range_end, bitmap, bitmap_size) { + __accept_memory(range_start * PMD_SIZE, range_end * PMD_SIZE); + bitmap_clear(bitmap, range_start, range_end - range_start); + } +} diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index cf690d8712f4..c41a87b0ec06 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -454,6 +454,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap, #endif debug_putstr("\nDecompressing Linux... "); + + if (boot_params->unaccepted_memory) { + debug_putstr("Accepting memory... "); + accept_memory(__pa(output), __pa(output) + needed_size); + } + __decompress(input_data, input_len, NULL, NULL, output, output_len, NULL, error); parse_elf(output); diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 245cf8f2a0bd..da7ef45133b9 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -235,4 +235,10 @@ static inline unsigned long efi_find_vendor_table(struct boot_params *bp, } #endif /* CONFIG_EFI */ +#ifdef CONFIG_UNACCEPTED_MEMORY +void accept_memory(phys_addr_t start, phys_addr_t end); +#else +static inline void accept_memory(phys_addr_t start, phys_addr_t end) {} +#endif + #endif /* BOOT_COMPRESSED_MISC_H */ diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h index df0736d32858..41fbfc798100 100644 --- a/arch/x86/include/asm/unaccepted_memory.h +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -7,4 +7,6 @@ struct boot_params; void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num); +void accept_memory(phys_addr_t start, phys_addr_t end); + #endif From patchwork Tue Jun 14 12:02:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880968 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBDE9CCA47A for ; Tue, 14 Jun 2022 12:03:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B25EA8D0245; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A0EEB8D0247; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8ADAF8D0245; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 5F9F38D0247 for ; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay13.hostedemail.com (Postfix) with ESMTP id 34D806177B for ; Tue, 14 Jun 2022 12:02:46 +0000 (UTC) X-FDA: 79576704732.16.3384EC2 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by imf01.hostedemail.com (Postfix) with ESMTP id 7654D400AE for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208165; x=1686744165; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Vt7QSmHfxcGQwi2CPob9vBQysbMDXN+N63yj2aSmEdc=; b=d0N7IGfdtvGeKFUkxdu/t+7A4wUXRu2gYcR/ye6mScH30Nvqfc3lwpya BcwPaMDogctiIVLwUVJBgdxvCR/eL5AK1+AMhhgcZXffrc9dyyMrCjJea 479xSd9auq2Yu/EDvwAmhtUj4CQJFxhDBzFS5i6ZtNvMxBhb5u6A0jd/v 4+fpek4CEdcsM8NnqCNPFclCWl/1v4tDFVQap8lDHqa/FNTAihZgBE8y2 X6OLqPr6w78bc/HNq2+F34pe4QJ6yxO8Fxx0la6ETXaECCumGF7kOQaAf vVKK/Kf0NWTH4EdUCGM6VYuxRz63e89vCOY1TLam7fXt0F0aLTCk8lhDL A==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="340260657" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="340260657" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="640332827" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga008.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id C01F088D; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Mike Rapoport Subject: [PATCHv7 08/14] x86/mm: Reserve unaccepted memory bitmap Date: Tue, 14 Jun 2022 15:02:25 +0300 Message-Id: <20220614120231.48165-9-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208165; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=lC3G4n0zFqgOUruBrn2FFz4wqh1BEanTgj1uGK55NL0=; b=IxgHlqgX/53lBINsd3HhBcEAuOme5tALg9NDxMM23H1R/DiJIG6HzF+hrS4tMhcUeCnGfP ZijmV1hq4MBoOnQNT0H2nAt1KCspKjY3Z8PaRoUHeAk5J2GXCl+eqotq026w8SMuRUOaPG azOIyEmvm75zYXETqklfRaexqONbuSo= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208165; a=rsa-sha256; cv=none; b=d9cL5CFVe1IgzfeSsuUvQrqW5fNM5irGllBi6BGKV7zfisiXo1vOt91uAf/+gRfKBk3rtK xqjIM18iN+aiRCQTXZljZmw43Jc3lN2eZdxo7Zaol1p8ByDs3mchNxCKJlq6DHgZBhTJQ2 Vja5FvfNDa2btOvZzGRQH9su00DD0oM= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=d0N7IGfd; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=d0N7IGfd; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: 4z7kwe3t6pcboa3kmwbssinczxbhrz5g X-Rspamd-Queue-Id: 7654D400AE X-HE-Tag: 1655208165-316268 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: A given page of memory can only be accepted once. The kernel has to accept memory both in the early decompression stage and during normal runtime. A bitmap is used to communicate the acceptance state of each page between the decompression stage and normal runtime. boot_params is used to communicate location of the bitmap throughout the boot. The bitmap is allocated and initially populated in EFI stub. Decompression stage accepts pages required for kernel/initrd and marks these pages accordingly in the bitmap. The main kernel picks up the bitmap from the same boot_params and uses it to determine what has to be accepted on allocation. In the runtime kernel, reserve the bitmap's memory to ensure nothing overwrites it. The size of bitmap is determined with e820__end_of_ram_pfn() which relies on setup_e820() marking unaccepted memory as E820_TYPE_RAM. Signed-off-by: Kirill A. Shutemov Acked-by: Mike Rapoport --- arch/x86/kernel/e820.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index f267205f2d5a..22d1fe48dcba 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -1316,6 +1316,16 @@ void __init e820__memblock_setup(void) int i; u64 end; + /* Mark unaccepted memory bitmap reserved */ + if (boot_params.unaccepted_memory) { + unsigned long size; + + /* One bit per 2MB */ + size = DIV_ROUND_UP(e820__end_of_ram_pfn() * PAGE_SIZE, + PMD_SIZE * BITS_PER_BYTE); + memblock_reserve(boot_params.unaccepted_memory, size); + } + /* * The bootstrap memblock region count maximum is 128 entries * (INIT_MEMBLOCK_REGIONS), but EFI might pass us more E820 entries From patchwork Tue Jun 14 12:02:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880965 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A64AC43334 for ; Tue, 14 Jun 2022 12:03:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 09AF98D023D; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F19AA8D0247; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C0C1D8D023D; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 8FB1D8D0242 for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 6825E16A8 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) X-FDA: 79576704690.17.796E43F Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by imf01.hostedemail.com (Postfix) with ESMTP id 8B0AA400AE for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208164; x=1686744164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4+cZiXkfCn9pd/1sYbCEjbMclwb37pFvdZ6baZwVkzM=; b=ZyB0CiBBmc/AhX2L6ZMBtcRXD72VD+AbSnO4O1g2dRwvUuLpn/o+SIVt li6v3gsIQ71r8A0oNUN+PMDNd15J0r9DlJCQzOSu64oVwWAla0pl28Cc9 hopXk5iwGGSkpJt4wCq/kirgabvjZS0MZydKFPK0x5NxI6Yvgke4s1Vfk tF/c7G4c0lM9vE0wTHBEDP4C39keGZi4ptpVaZuZcH1ag1pA3rx7b/vOB /UUMW62H+ZFSg3qIyOBVQTDIC5lwvm+BncBQFVIROPprUJbVJXqYqhMa2 9RpPpJoDb4LxDjGXP6tj9hDKWPe/VjuLogarFgkrJN8mU3DYbLh7vEesB Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="340260645" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="340260645" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="686633480" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga002.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id C571B4EB; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 09/14] x86/mm: Provide helpers for unaccepted memory Date: Tue, 14 Jun 2022 15:02:26 +0300 Message-Id: <20220614120231.48165-10-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208164; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=moklifrXT687O3bE/HV0TN8CPE9FCj861qdwV/m46QM=; b=k3GEyf7PMuX6uwjR8wNNLr1wAxcMYeMpiVc86ICEF2xhTcCO0aYXO7mHUuqovp9AHBXbpx g2yvkKfm7R2rWI39qtdkkos9NfaMSMEckilQ/EotAhSBg7qjF6wiqhlqlw+h8JbshBWVp5 lyDRuAdk/D2JRn/s9hYhznb1LJEo7Lw= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208164; a=rsa-sha256; cv=none; b=Gb+L+YqSJVgup6RE4zdNQVVU9WIjWqGi8cxUs5gfTOpy/yCZakb47PlAOwXWBqyiqvl/sh jALVbkErgpTLgE6kigw4QD/SSgU90a4AoHx5BCNVxbAdw01EXyzaSm3XdlHyb+ARnlkKCC nPk2uB1gYgeGRrbPSUm/1a+uXIzLXm0= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=ZyB0CiBB; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=ZyB0CiBB; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: 9djmyuxi9mo9jtz5h5hsxh75mysps5xp X-Rspamd-Queue-Id: 8B0AA400AE X-HE-Tag: 1655208164-448720 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Core-mm requires few helpers to support unaccepted memory: - accept_memory() checks the range of addresses against the bitmap and accept memory if needed. - range_contains_unaccepted_memory() checks if anything within the range requires acceptance. Signed-off-by: Kirill A. Shutemov --- arch/x86/include/asm/page.h | 3 ++ arch/x86/include/asm/unaccepted_memory.h | 4 ++ arch/x86/mm/Makefile | 2 + arch/x86/mm/unaccepted_memory.c | 61 ++++++++++++++++++++++++ 4 files changed, 70 insertions(+) create mode 100644 arch/x86/mm/unaccepted_memory.c diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h index 9cc82f305f4b..df4ec3a988dc 100644 --- a/arch/x86/include/asm/page.h +++ b/arch/x86/include/asm/page.h @@ -19,6 +19,9 @@ struct page; #include + +#include + extern struct range pfn_mapped[]; extern int nr_pfn_mapped; diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h index 41fbfc798100..89fc91c61560 100644 --- a/arch/x86/include/asm/unaccepted_memory.h +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -7,6 +7,10 @@ struct boot_params; void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num); +#ifdef CONFIG_UNACCEPTED_MEMORY + void accept_memory(phys_addr_t start, phys_addr_t end); +bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end); #endif +#endif diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index f8220fd2c169..96fc2766b6d7 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -59,3 +59,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + +obj-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c new file mode 100644 index 000000000000..1df918b21469 --- /dev/null +++ b/arch/x86/mm/unaccepted_memory.c @@ -0,0 +1,61 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include + +#include +#include +#include + +/* Protects unaccepted memory bitmap */ +static DEFINE_SPINLOCK(unaccepted_memory_lock); + +void accept_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long range_start, range_end; + unsigned long *bitmap; + unsigned long flags; + + if (!boot_params.unaccepted_memory) + return; + + bitmap = __va(boot_params.unaccepted_memory); + range_start = start / PMD_SIZE; + + spin_lock_irqsave(&unaccepted_memory_lock, flags); + for_each_set_bitrange_from(range_start, range_end, bitmap, + DIV_ROUND_UP(end, PMD_SIZE)) { + unsigned long len = range_end - range_start; + + /* Platform-specific memory-acceptance call goes here */ + panic("Cannot accept memory: unknown platform\n"); + bitmap_clear(bitmap, range_start, len); + } + spin_unlock_irqrestore(&unaccepted_memory_lock, flags); +} + +bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long *bitmap; + unsigned long flags; + bool ret = false; + + if (!boot_params.unaccepted_memory) + return 0; + + bitmap = __va(boot_params.unaccepted_memory); + + spin_lock_irqsave(&unaccepted_memory_lock, flags); + while (start < end) { + if (test_bit(start / PMD_SIZE, bitmap)) { + ret = true; + break; + } + + start += PMD_SIZE; + } + spin_unlock_irqrestore(&unaccepted_memory_lock, flags); + + return ret; +} From patchwork Tue Jun 14 12:02:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880963 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30E4AC433EF for ; Tue, 14 Jun 2022 12:02:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9CD138D0243; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 86A4B8D0244; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 562108D0243; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 396078D023D for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 0847935EF1 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) X-FDA: 79576704690.12.8431F61 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf02.hostedemail.com (Postfix) with ESMTP id 00547800A2 for ; Tue, 14 Jun 2022 12:02:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208164; x=1686744164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bd6faJOVW2FfIjzDQn03hQW9BQt9XmcRmtaVqSAHsus=; b=VI/ZaUFEirvPHOeGDgj/T3vS1hOTnEG5htdXshXnabHXtmg3DuX4v+Yi 0DuWAN+/4PvyV7pvi4VRaZb3gqwXuWUoB+WimDgtBIMLtU3r19dz0Orne 7tQTaAgaPxDnOAOkxkkyxAZToWwzjHsbAXY3eaw8iXdpOjXWo0q5HNqvD y+eJchmBfr9TD28QFod4WKILWYl9WFyKGNZPAx0XxRAozHkWKrChtP0rd EnQcyWAwgZ8U4fYRK9AM8+CAJqp7uOs5iwtuMiYCZQe9h6mAADzUZLCR4 LrZCD1hWW3uH0hveANRRbA05fUNOO9CQmHi6D89+IrxsEUw+0+YRSdn8j w==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="276138240" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="276138240" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="535534975" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga003.jf.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id D40B792D; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 10/14] x86/mm: Avoid load_unaligned_zeropad() stepping into unaccepted memory Date: Tue, 14 Jun 2022 15:02:27 +0300 Message-Id: <20220614120231.48165-11-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208164; a=rsa-sha256; cv=none; b=gonY1WNNlts5Otr6MruZmle8zEM//sBa6TpiD6NyBDmEH3IwRI3Sbfyf6tZ2KT3tzMMXUH iLWxuqI5MhVz9v+KXa81+3HUo5VoI8gan0uP0IB+tdKiyzBURHGQ9yJTTBUtBFwuehiGAZ UcT6cG7qn7Dh5pujGvjWSnITPhJlaJQ= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b="VI/ZaUFE"; spf=none (imf02.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208164; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=kmfRzBMfXCTLbCAYkUrsy7+bxCVXQn86pog+30PajRc=; b=5A8W+qG14ZIYwCZVPA0faSY/h278/zCn9maFhL5nfFHvy3in9Ia2kwD6N/TAUucRPjWa29 YC9452lKGTX3tqCTL+qjcMKlMTSxfDMVirnWFP5VDE1U8HwMHUifkLnMCTy0P/R/ZWkTSE q5uqwUYj31oG16ImnkS7SABBrRBMKSI= X-Stat-Signature: whhrjm9i7wrfaias74pqjyue3dip6gyb X-Rspamd-Queue-Id: 00547800A2 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b="VI/ZaUFE"; spf=none (imf02.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655208163-631350 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: load_unaligned_zeropad() can lead to unwanted loads across page boundaries. The unwanted loads are typically harmless. But, they might be made to totally unrelated or even unmapped memory. load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now #VE) to recover from these unwanted loads. But, this approach does not work for unaccepted memory. For TDX, a load from unaccepted memory will not lead to a recoverable exception within the guest. The guest will exit to the VMM where the only recourse is to terminate the guest. There are three parts to fix this issue and comprehensively avoid access to unaccepted memory. Together these ensure that an extra “guard” page is accepted in addition to the memory that needs to be used. 1. Implicitly extend the range_contains_unaccepted_memory(start, end) checks up to end+2M if ‘end’ is aligned on a 2M boundary. 2. Implicitly extend accept_memory(start, end) to end+2M if ‘end’ is aligned on a 2M boundary. 3. Set PageUnaccepted() on both memory that itself needs to be accepted *and* memory where the next page needs to be accepted. Essentially, make PageUnaccepted(page) a marker for whether work needs to be done to make ‘page’ usable. That work might include accepting pages in addition to ‘page’ itself. Side note: This leads to something strange. Pages which were accepted at boot, marked by the firmware as accepted and will never _need_ to be accepted might have PageUnaccepted() set on them. PageUnaccepted(page) is a cue to ensure that the next page is accepted before ‘page’ can be used. This is an actual, real-world problem which was discovered during TDX testing. Signed-off-by: Kirill A. Shutemov Reviewed-by: Dave Hansen --- arch/x86/mm/unaccepted_memory.c | 36 +++++++++++++++++++++++++ drivers/firmware/efi/libstub/x86-stub.c | 7 +++++ 2 files changed, 43 insertions(+) diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c index 1df918b21469..bcd56fe82b9e 100644 --- a/arch/x86/mm/unaccepted_memory.c +++ b/arch/x86/mm/unaccepted_memory.c @@ -23,6 +23,38 @@ void accept_memory(phys_addr_t start, phys_addr_t end) bitmap = __va(boot_params.unaccepted_memory); range_start = start / PMD_SIZE; + /* + * load_unaligned_zeropad() can lead to unwanted loads across page + * boundaries. The unwanted loads are typically harmless. But, they + * might be made to totally unrelated or even unmapped memory. + * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now + * #VE) to recover from these unwanted loads. + * + * But, this approach does not work for unaccepted memory. For TDX, a + * load from unaccepted memory will not lead to a recoverable exception + * within the guest. The guest will exit to the VMM where the only + * recourse is to terminate the guest. + * + * There are three parts to fix this issue and comprehensively avoid + * access to unaccepted memory. Together these ensure that an extra + * “guard” page is accepted in addition to the memory that needs to be + * used: + * + * 1. Implicitly extend the range_contains_unaccepted_memory(start, end) + * checks up to end+2M if ‘end’ is aligned on a 2M boundary. + * + * 2. Implicitly extend accept_memory(start, end) to end+2M if ‘end’ is + * aligned on a 2M boundary. + * + * 3. Set PageUnaccepted() on both memory that itself needs to be + * accepted *and* memory where the next page needs to be accepted. + * Essentially, make PageUnaccepted(page) a marker for whether work + * needs to be done to make ‘page’ usable. That work might include + * accepting pages in addition to ‘page’ itself. + */ + if (!(end % PMD_SIZE)) + end += PMD_SIZE; + spin_lock_irqsave(&unaccepted_memory_lock, flags); for_each_set_bitrange_from(range_start, range_end, bitmap, DIV_ROUND_UP(end, PMD_SIZE)) { @@ -46,6 +78,10 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end) bitmap = __va(boot_params.unaccepted_memory); + /* See comment on load_unaligned_zeropad() in accept_memory() */ + if (!(end % PMD_SIZE)) + end += PMD_SIZE; + spin_lock_irqsave(&unaccepted_memory_lock, flags); while (start < end) { if (test_bit(start / PMD_SIZE, bitmap)) { diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index b91c89100b2d..bc1110509de4 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -709,6 +709,13 @@ static efi_status_t allocate_unaccepted_memory(struct boot_params *params, return EFI_SUCCESS; } + /* + * range_contains_unaccepted_memory() may need to check one 2M chunk + * beyond the end of RAM to deal with load_unaligned_zeropad(). Make + * sure that the bitmap is large enough handle it. + */ + max_addr += PMD_SIZE; + /* * If unaccepted memory is present allocate a bitmap to track what * memory has to be accepted before access. From patchwork Tue Jun 14 12:02:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880961 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C97A8C433EF for ; Tue, 14 Jun 2022 12:02:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D94878D0240; Tue, 14 Jun 2022 08:02:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D44608D023D; Tue, 14 Jun 2022 08:02:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B95F98D0240; Tue, 14 Jun 2022 08:02:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id AC1FF8D023D for ; Tue, 14 Jun 2022 08:02:44 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 86AFA21182 for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) X-FDA: 79576704648.22.8232DD1 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf22.hostedemail.com (Postfix) with ESMTP id BEB10C00B6 for ; Tue, 14 Jun 2022 12:02:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208163; x=1686744163; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5iNOOTJdL5ijo85SJxVFVHCZr4Xdf/z52sXS4dv1fJU=; b=GPjFgFOYMWyzEPeCBZomWjCIWOPZY522Dv6T3A92NE3e1aojWdYyCG1w JW048DJK2qH7U1NqIyT2v4/ln2nOihTRGuTDX+AaP7RwkJ+1uK1goMbwJ n2xIP/FKh4XOR5VbgPpJBKIN573jDjAmuIadD/qhcPPmgA3J7S5KnuQKW Go/d6YGRNtrJlEEqr9/zGko+AHssxlmPY9bYxfA4+mh9ephilIZAXevD2 LslzqRgWmmmLD3OjlRs1dFT5L7yYNU3s8YHtomO/urnD/H59AmO74lGXc B+naq8J3fxpx0jmRv9LZ2EJMBWWM+uDj5Voe1JXbmF56seFMcsMrQhcXc A==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="267281679" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="267281679" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="558311426" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga006.jf.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id DA1A7911; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 11/14] x86: Disable kexec if system has unaccepted memory Date: Tue, 14 Jun 2022 15:02:28 +0300 Message-Id: <20220614120231.48165-12-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208164; a=rsa-sha256; cv=none; b=bOmJcLTnIaseoHvEX4EmmAJGoghoYKCEfty2+9UXhSBAp1VZ0dz2wugLiift2A16pnsbLe qdOi6GqFO8Im+dG0jE6Huk/s6bF9vEU3/9F1y4NswVXjFlHKPLeSEcceCLr+Mve0w5SsHr QO46mTWsJOjpwMmhJ+PWnSFtykUSlFc= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=GPjFgFOY; spf=none (imf22.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208164; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=T83ZuDv6C39ecNEbaEJCyi/QMOfVe0PDc9sBvHINSvQ=; b=l42IFO/QqHoPmjU0beFd3nEMpQAAczFNW4TzmJY+KZhUhhjNRXD1Mf4QtNwtVTXXFWZCo/ sv57ZxKgZvu2tOYB6LPLSc/3Guf1NNWM1d94jrALm3vIMyWWoDTsIMuG+9c7Jkc0JsDgQb WFX1VC6SnHWsnZVqhP9Rz0awZgekD1w= X-Stat-Signature: ycpn3xg4zqpgtknnkp1kb1hswd6f4r7t X-Rspamd-Queue-Id: BEB10C00B6 Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=GPjFgFOY; spf=none (imf22.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655208163-287371 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On kexec, the target kernel has to know what memory has been accepted. Information in EFI map is out of date and cannot be used. boot_params.unaccepted_memory can be used to pass the bitmap between two kernels on kexec, but the use-case is not yet implemented. Disable kexec on machines with unaccepted memory for now. Signed-off-by: Kirill A. Shutemov Signed-off-by: Kirill A. Shutemov --- arch/x86/mm/unaccepted_memory.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c index bcd56fe82b9e..05e216716690 100644 --- a/arch/x86/mm/unaccepted_memory.c +++ b/arch/x86/mm/unaccepted_memory.c @@ -1,4 +1,5 @@ // SPDX-License-Identifier: GPL-2.0-only +#include #include #include #include @@ -95,3 +96,21 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end) return ret; } + +static int __init unaccepted_init(void) +{ + if (!boot_params.unaccepted_memory) + return 0; + +#ifdef CONFIG_KEXEC_CORE + /* + * TODO: Information on memory acceptance status has to be communicated + * between kernel. + */ + pr_warn("Disable kexec: not yet supported on systems with unaccepted memory\n"); + kexec_load_disabled = 1; +#endif + + return 0; +} +fs_initcall(unaccepted_init); From patchwork Tue Jun 14 12:02:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880967 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEF2ACCA47E for ; Tue, 14 Jun 2022 12:03:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7019E8D0244; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6B2CF8D0248; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 46BE28D0244; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id F21A78D0248 for ; Tue, 14 Jun 2022 08:02:45 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id B2B376191F for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) X-FDA: 79576704690.05.BC1F1E9 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by imf03.hostedemail.com (Postfix) with ESMTP id F2F30200A4 for ; Tue, 14 Jun 2022 12:02:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208165; x=1686744165; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ifzU/1O3tKgZxe0qDTIUj1kc7Q7SWbzosWaglEBdNsA=; b=bZ6Oaqa3tS+48XfejNGIWdkB1UsP6git8YFt9AdGcUhAujebGie9qduY H5x0/AGWCDcIaqJE2LHYTgnpIVZXQcGSeXms+tUPpqWBvmQM0ZFqYJt9l GlNV7QtOOYMpFC8ADnP53JowanTvUXYKf/4m5c3kumyrnqEBl/yZdu8X6 Cgfh1JHXfjGUJL3f2W3oR2NwpVVTfuwxxJw10XeN5bjlvy+Jq9M0A2/kf vSdOAeCYNRh4qMlkTw6DNw/ihzDItzfj4StI4IZrBXj5WhynbUijN5vZL wRqQcqpgKo8yJguq951XoFRGCCUrKfN7ZJV50mCBj3RsIsKz1D/1U77Lc g==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="258432265" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="258432265" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="673838645" Received: from black.fi.intel.com ([10.237.72.28]) by FMSMGA003.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id EB0E69E6; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 12/14] x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub Date: Tue, 14 Jun 2022 15:02:29 +0300 Message-Id: <20220614120231.48165-13-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208165; a=rsa-sha256; cv=none; b=iTUlvFHt+KezSmLInIahco9HHL/wI90NByXXn+rfnI3wfXTymwB5X6X18MwgeDboZq5wMl ylJTxbE25WNhoxq5f1f+StzzLK3XhB6e6fgJBn9pHWffbVLTcVRKT43nl3P8GJ9V0NTvHE 1XUNjkIEQ8bBHVJtNO9Z8BRY9OYnSZA= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=bZ6Oaqa3; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.136) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208165; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=zp0SeVhEU3EBvvTIbT1ePyHUSPp7/qvssHmpI81vUPo=; b=8IByUvgBDpgk/ugHopJWHFK8qxkdTEdv7ZNuhvEf+JwzDTTAwv6PcXvzB/sPBQwbMFmEN/ JV6++UK4jIaDgA/QHzCD6nXxwjdTclkTmqDhdT0cKzVzHORg+TJnzyJHSogidmC9LygKXe 5QlrWLjkH13Rj6MalcZ4Va1AvMFWl8w= X-Rspamd-Queue-Id: F2F30200A4 X-Rspam-User: Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=bZ6Oaqa3; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.136) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspamd-Server: rspam06 X-Stat-Signature: gcabazfmj899gaqc16g3zacgruggegaj X-HE-Tag: 1655208164-448615 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Memory acceptance requires a hypercall and one or multiple module calls. Make helpers for the calls available in boot stub. It has to accept memory where kernel image and initrd are placed. Signed-off-by: Kirill A. Shutemov Reviewed-by: Dave Hansen --- arch/x86/coco/tdx/tdx.c | 26 ------------------ arch/x86/include/asm/shared/tdx.h | 45 +++++++++++++++++++++++++++++++ arch/x86/include/asm/tdx.h | 19 ------------- 3 files changed, 45 insertions(+), 45 deletions(-) diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index 3bcaf2170ede..b75fe670032b 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -12,14 +12,6 @@ #include #include -/* TDX module Call Leaf IDs */ -#define TDX_GET_INFO 1 -#define TDX_GET_VEINFO 3 -#define TDX_ACCEPT_PAGE 6 - -/* TDX hypercall Leaf IDs */ -#define TDVMCALL_MAP_GPA 0x10001 - /* MMIO direction */ #define EPT_READ 0 #define EPT_WRITE 1 @@ -34,24 +26,6 @@ #define VE_GET_PORT_NUM(e) ((e) >> 16) #define VE_IS_IO_STRING(e) ((e) & BIT(4)) -/* - * Wrapper for standard use of __tdx_hypercall with no output aside from - * return code. - */ -static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) -{ - struct tdx_hypercall_args args = { - .r10 = TDX_HYPERCALL_STANDARD, - .r11 = fn, - .r12 = r12, - .r13 = r13, - .r14 = r14, - .r15 = r15, - }; - - return __tdx_hypercall(&args, 0); -} - /* Called from __tdx_hypercall() for unrecoverable failure */ void __tdx_hypercall_failed(void) { diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h index e53f26228fbb..956ced04c3be 100644 --- a/arch/x86/include/asm/shared/tdx.h +++ b/arch/x86/include/asm/shared/tdx.h @@ -13,6 +13,14 @@ #define TDX_CPUID_LEAF_ID 0x21 #define TDX_IDENT "IntelTDX " +/* TDX module Call Leaf IDs */ +#define TDX_GET_INFO 1 +#define TDX_GET_VEINFO 3 +#define TDX_ACCEPT_PAGE 6 + +/* TDX hypercall Leaf IDs */ +#define TDVMCALL_MAP_GPA 0x10001 + #ifndef __ASSEMBLY__ /* @@ -33,8 +41,45 @@ struct tdx_hypercall_args { /* Used to request services from the VMM */ u64 __tdx_hypercall(struct tdx_hypercall_args *args, unsigned long flags); +/* + * Wrapper for standard use of __tdx_hypercall with no output aside from + * return code. + */ +static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) +{ + struct tdx_hypercall_args args = { + .r10 = TDX_HYPERCALL_STANDARD, + .r11 = fn, + .r12 = r12, + .r13 = r13, + .r14 = r14, + .r15 = r15, + }; + + return __tdx_hypercall(&args, 0); +} + + /* Called from __tdx_hypercall() for unrecoverable failure */ void __tdx_hypercall_failed(void); +/* + * Used in __tdx_module_call() to gather the output registers' values of the + * TDCALL instruction when requesting services from the TDX module. This is a + * software only structure and not part of the TDX module/VMM ABI + */ +struct tdx_module_output { + u64 rcx; + u64 rdx; + u64 r8; + u64 r9; + u64 r10; + u64 r11; +}; + +/* Used to communicate with the TDX module */ +u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, + struct tdx_module_output *out); + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_SHARED_TDX_H */ diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 020c81a7c729..d9106d3e89f8 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -20,21 +20,6 @@ #ifndef __ASSEMBLY__ -/* - * Used to gather the output registers values of the TDCALL and SEAMCALL - * instructions when requesting services from the TDX module. - * - * This is a software only structure and not part of the TDX module/VMM ABI. - */ -struct tdx_module_output { - u64 rcx; - u64 rdx; - u64 r8; - u64 r9; - u64 r10; - u64 r11; -}; - /* * Used by the #VE exception handler to gather the #VE exception * info from the TDX module. This is a software only structure @@ -55,10 +40,6 @@ struct ve_info { void __init tdx_early_init(void); -/* Used to communicate with the TDX module */ -u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, - struct tdx_module_output *out); - void tdx_get_ve_info(struct ve_info *ve); bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve); From patchwork Tue Jun 14 12:02:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880969 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB2AAC433EF for ; Tue, 14 Jun 2022 12:03:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0C19C8D0247; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0997B8D0248; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E7F848D0247; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id C66898D0248 for ; Tue, 14 Jun 2022 08:02:46 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id A80E436000 for ; Tue, 14 Jun 2022 12:02:46 +0000 (UTC) X-FDA: 79576704732.18.90288AB Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by imf21.hostedemail.com (Postfix) with ESMTP id E1B111C00B6 for ; Tue, 14 Jun 2022 12:02:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208166; x=1686744166; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HHc2v/294BxJnc5ewxUspD14zKyUa2q1jtot0Cq83W4=; b=isjgJUoUx6PI1bN/C7XG2b3uH80/fkx4V8O2WqGQSWy/QEvT2Xe1x6GC zYDseFvaAsAbYuVyOeieZtW5StUXkMJAGLyMtXK7R7iCgqh4WWxLQqdeO srIGgloGuhc0LnYJqolmxsSnBfg48rgujctrdLonCMK5ER98ek5XwxcSL sISI/Z+MvK8r85vMjD2fBh6FbMDJALIG2xMzjHAbGfUS9oteH7CGRnAqU ltF2PAq3DXpLaEhmyZfvKKA3A9gKNa4HUO7+sEfnEh3x+vZwnpsR5dUkf oy76LFcG6Jg9fjlNV8zkuJkcAJwb5RUBavykmWx6RnFbG+NdV5v8ZHH3v w==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="340260658" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="340260658" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="640332830" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga008.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 02C87A17; Tue, 14 Jun 2022 15:02:32 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 13/14] x86/tdx: Refactor try_accept_one() Date: Tue, 14 Jun 2022 15:02:30 +0300 Message-Id: <20220614120231.48165-14-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208166; a=rsa-sha256; cv=none; b=Pk91wjOp8GxBYSM9GxH6ybe5L9dYls27uZHH4ue14N9W/evs1VLPhpEegCGgoSOLj4REfH dSVuxDTAPvpp8uGZN0plj5fb+spjyPrXjygqqMeylz5fW9QsE5LKNqpfKsjsX+c0MIQKps MoU2Bt08dKOlz3xO+UeLMaWb9ZsqzRs= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=isjgJUoU; spf=none (imf21.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208166; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=q3NgFyDXpOAWDgAdVkOq/B81VNBQQ/jVFQLBdNrVxCg=; b=tNy/RMt+SuTUPRsQd9F6FL1A5cydOL2qt4jTe3iXx/0TdMqSSNh2nsCaob4HblVH0Jc+Gk OKDtVsJmzMz8mBmK9W5jSUQfiDl859yNMkQ9ay/BB5CfMMZwoVXOufZLoFk/wO+EPZ5d35 o5bl5azN9Djdy0yFOignIgLbelSQmOM= X-Stat-Signature: nscd4yi8zu1oyudj6qep4cnm6wdsrh4r X-Rspamd-Queue-Id: E1B111C00B6 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=isjgJUoU; spf=none (imf21.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655208165-502580 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Rework try_accept_one() to return accepted size instead of modifying 'start' inside the helper. It makes 'start' in-only argumaent and streamlines code on the caller side. Signed-off-by: Kirill A. Shutemov Suggested-by: Borislav Petkov Reviewed-by: Dave Hansen --- arch/x86/coco/tdx/tdx.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index b75fe670032b..b10c95307f6e 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -624,18 +624,18 @@ static bool tdx_cache_flush_required(void) return true; } -static bool try_accept_one(phys_addr_t *start, unsigned long len, - enum pg_level pg_level) +static unsigned long try_accept_one(phys_addr_t start, unsigned long len, + enum pg_level pg_level) { unsigned long accept_size = page_level_size(pg_level); u64 tdcall_rcx; u8 page_size; - if (!IS_ALIGNED(*start, accept_size)) - return false; + if (!IS_ALIGNED(start, accept_size)) + return 0; if (len < accept_size) - return false; + return 0; /* * Pass the page physical address to the TDX module to accept the @@ -654,15 +654,14 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len, page_size = 2; break; default: - return false; + return 0; } - tdcall_rcx = *start | page_size; + tdcall_rcx = start | page_size; if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL)) - return false; + return 0; - *start += accept_size; - return true; + return accept_size; } /* @@ -699,21 +698,22 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) */ while (start < end) { unsigned long len = end - start; + unsigned long accept_size; /* * Try larger accepts first. It gives chance to VMM to keep - * 1G/2M SEPT entries where possible and speeds up process by - * cutting number of hypercalls (if successful). + * 1G/2M Secure EPT entries where possible and speeds up + * process by cutting number of hypercalls (if successful). */ - if (try_accept_one(&start, len, PG_LEVEL_1G)) - continue; - - if (try_accept_one(&start, len, PG_LEVEL_2M)) - continue; - - if (!try_accept_one(&start, len, PG_LEVEL_4K)) + accept_size = try_accept_one(start, len, PG_LEVEL_1G); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_2M); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_4K); + if (!accept_size) return false; + start += accept_size; } return true; From patchwork Tue Jun 14 12:02:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12880970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE276C43334 for ; Tue, 14 Jun 2022 12:03:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5CBC18D0249; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 57B888D0248; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 41C4E8D0249; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 2CE3E8D0248 for ; Tue, 14 Jun 2022 08:02:47 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay11.hostedemail.com (Postfix) with ESMTP id 0752181717 for ; Tue, 14 Jun 2022 12:02:47 +0000 (UTC) X-FDA: 79576704774.04.58DB358 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by imf01.hostedemail.com (Postfix) with ESMTP id 56CD5400AE for ; Tue, 14 Jun 2022 12:02:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208166; x=1686744166; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7oQeAKlAd2ATqVqYodMfjK9yYRqZpDnGTOvOo+Mth6U=; b=T55ZTwc8BOh1SFp01bP9uwSJdxAS4rKiMABOlHw5DgyWUpI/wjI+GrIk CD4NCMfEXTDY7uItnRm5vpAPfmBdrWcgBLEeRt1Nlkrys5uIpDcrUdpjS BZ1HMeyY53RH3X2vKG5dwfW6qy5JZu/zJSLapBIIXXIwXXZy9IsuYb8ut Q5cb+CncxlwIA28LzHDXS43Pr9kta4folz4lUC3N2Uav3sTrYf7P98UtQ 0rix2hBNfOAv84xzdNSyWhBh5Uu2K2s/VtuvzdXT7Ja0qrw139+odYLSZ m3xqHiXuvMY2K6Bai+86ShJ7AzI098Ism8uoHSVtAlaWpy/mpIX5vl0HV w==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="340260659" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="340260659" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:02:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="640332833" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga008.fm.intel.com with ESMTP; 14 Jun 2022 05:02:36 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 0FCEDA3F; Tue, 14 Jun 2022 15:02:33 +0300 (EEST) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv7 14/14] x86/tdx: Add unaccepted memory support Date: Tue, 14 Jun 2022 15:02:31 +0300 Message-Id: <20220614120231.48165-15-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655208166; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=7WM3U/1l8iBLqZlOWtq93syLdTjvEhFleBf669K4l7Q=; b=nzQyq4F6l+ljyfinfhOOFHdA/dOyUAUPW0qYXpgKPorReRlA3thw8JWq7vmAFhtH81L4ME AZZKMkf77+jUN33IvTKfKNNLNfRbWyV0J2CJzeJlfwDLPYeRBKeE7PK3rlPosQ/9Ce/EjZ /BGWpdiws+bovI7+UPDjLshsn7COhTQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655208166; a=rsa-sha256; cv=none; b=rHT2hruQmBnfP8Hlw2nmvA7gHEBXhOBMJhu4g384tGfizF8ik6Bv0GzR3sI8pMoMFW4leP 7Yww25MyRn55uGGTnzvZrc1JUm1xACpE5y+yKQMQ6vfvmF1KfyaMT+DrOYQjV72EameX1p 1W7pV8A8Hdf0ZFWDhmnb9QkA+yIDmuI= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=T55ZTwc8; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=T55ZTwc8; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf01.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: p7kjepnfdx5axbxxotiwe93z8wuaacss X-Rspamd-Queue-Id: 56CD5400AE X-HE-Tag: 1655208166-982338 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hookup TDX-specific code to accept memory. Accepting the memory is the same process as converting memory from shared to private: kernel notifies VMM with MAP_GPA hypercall and then accept pages with ACCEPT_PAGE module call. The implementation in core kernel uses tdx_enc_status_changed(). It already used for converting memory to shared and back for I/O transactions. Boot stub provides own implementation of tdx_accept_memory(). It is similar in structure to tdx_enc_status_changed(), but only cares about converting memory to private. Signed-off-by: Kirill A. Shutemov --- arch/x86/Kconfig | 1 + arch/x86/boot/compressed/mem.c | 27 ++++++++++- arch/x86/boot/compressed/tdx.c | 78 +++++++++++++++++++++++++++++++ arch/x86/coco/tdx/tdx.c | 30 ++++++++---- arch/x86/include/asm/shared/tdx.h | 2 + arch/x86/mm/unaccepted_memory.c | 9 +++- 6 files changed, 136 insertions(+), 11 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9783ebc4e021..80683afa5749 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -892,6 +892,7 @@ config INTEL_TDX_GUEST select ARCH_HAS_CC_PLATFORM select X86_MEM_ENCRYPT select X86_MCE + select UNACCEPTED_MEMORY help Support running as a guest under Intel TDX. Without this support, the guest kernel can not boot or run under TDX. diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c index b45458af00ca..48e36e640da1 100644 --- a/arch/x86/boot/compressed/mem.c +++ b/arch/x86/boot/compressed/mem.c @@ -5,6 +5,8 @@ #include "error.h" #include "find.h" #include "math.h" +#include "tdx.h" +#include #define PMD_SHIFT 21 #define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) @@ -12,10 +14,33 @@ extern struct boot_params *boot_params; +static bool is_tdx_guest(void) +{ + static bool once; + static bool is_tdx; + + if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST)) + return false; + + if (!once) { + u32 eax, sig[3]; + + cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax, + &sig[0], &sig[2], &sig[1]); + is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig)); + once = true; + } + + return is_tdx; +} + static inline void __accept_memory(phys_addr_t start, phys_addr_t end) { /* Platform-specific memory-acceptance call goes here */ - error("Cannot accept memory"); + if (is_tdx_guest()) + tdx_accept_memory(start, end); + else + error("Cannot accept memory: unknown platform\n"); } /* diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c index 918a7606f53c..8518a75e5dd5 100644 --- a/arch/x86/boot/compressed/tdx.c +++ b/arch/x86/boot/compressed/tdx.c @@ -3,12 +3,15 @@ #include "../cpuflags.h" #include "../string.h" #include "../io.h" +#include "align.h" #include "error.h" +#include "pgtable_types.h" #include #include #include +#include /* Called from __tdx_hypercall() for unrecoverable failure */ void __tdx_hypercall_failed(void) @@ -75,3 +78,78 @@ void early_tdx_detect(void) pio_ops.f_outb = tdx_outb; pio_ops.f_outw = tdx_outw; } + +static unsigned long try_accept_one(phys_addr_t start, unsigned long len, + enum pg_level level) +{ + unsigned long accept_size = PAGE_SIZE << ((level - 1) * PTE_SHIFT); + u64 tdcall_rcx; + u8 page_size; + + if (!IS_ALIGNED(start, accept_size)) + return 0; + + if (len < accept_size) + return 0; + + /* + * Pass the page physical address to the TDX module to accept the + * pending, private page. + * + * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G. + */ + switch (level) { + case PG_LEVEL_4K: + page_size = 0; + break; + case PG_LEVEL_2M: + page_size = 1; + break; + case PG_LEVEL_1G: + page_size = 2; + break; + default: + return 0; + } + + tdcall_rcx = start | page_size; + if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL)) + return 0; + + return accept_size; +} + +void tdx_accept_memory(phys_addr_t start, phys_addr_t end) +{ + /* + * Notify the VMM about page mapping conversion. More info about ABI + * can be found in TDX Guest-Host-Communication Interface (GHCI), + * section "TDG.VP.VMCALL" + */ + if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0)) + error("Accepting memory failed\n"); + + /* + * For shared->private conversion, accept the page using + * TDX_ACCEPT_PAGE TDX module call. + */ + while (start < end) { + unsigned long len = end - start; + unsigned long accept_size; + + /* + * Try larger accepts first. It gives chance to VMM to keep + * 1G/2M Secure EPT entries where possible and speeds up + * process by cutting number of hypercalls (if successful). + */ + + accept_size = try_accept_one(start, len, PG_LEVEL_1G); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_2M); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_4K); + if (!accept_size) + error("Accepting memory failed\n"); + start += accept_size; + } +} diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index b10c95307f6e..8240f04d3646 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -664,16 +664,9 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len, return accept_size; } -/* - * Inform the VMM of the guest's intent for this physical page: shared with - * the VMM or private to the guest. The VMM is expected to change its mapping - * of the page in response. - */ -static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) +static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end, + bool enc) { - phys_addr_t start = __pa(vaddr); - phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE); - if (!enc) { /* Set the shared (decrypted) bits: */ start |= cc_mkdec(0); @@ -719,6 +712,25 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) return true; } +void tdx_accept_memory(phys_addr_t start, phys_addr_t end) +{ + if (!tdx_enc_status_changed_phys(start, end, true)) + panic("Accepting memory failed: %#llx-%#llx\n", start, end); +} + +/* + * Inform the VMM of the guest's intent for this physical page: shared with + * the VMM or private to the guest. The VMM is expected to change its mapping + * of the page in response. + */ +static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) +{ + phys_addr_t start = __pa(vaddr); + phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE); + + return tdx_enc_status_changed_phys(start, end, enc); +} + void __init tdx_early_init(void) { u64 cc_mask; diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h index 956ced04c3be..97534c334473 100644 --- a/arch/x86/include/asm/shared/tdx.h +++ b/arch/x86/include/asm/shared/tdx.h @@ -81,5 +81,7 @@ struct tdx_module_output { u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, struct tdx_module_output *out); +void tdx_accept_memory(phys_addr_t start, phys_addr_t end); + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_SHARED_TDX_H */ diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c index 05e216716690..9ec2304272dc 100644 --- a/arch/x86/mm/unaccepted_memory.c +++ b/arch/x86/mm/unaccepted_memory.c @@ -7,6 +7,7 @@ #include #include +#include #include /* Protects unaccepted memory bitmap */ @@ -62,7 +63,13 @@ void accept_memory(phys_addr_t start, phys_addr_t end) unsigned long len = range_end - range_start; /* Platform-specific memory-acceptance call goes here */ - panic("Cannot accept memory: unknown platform\n"); + if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) { + tdx_accept_memory(range_start * PMD_SIZE, + range_end * PMD_SIZE); + } else { + panic("Cannot accept memory: unknown platform\n"); + } + bitmap_clear(bitmap, range_start, len); } spin_unlock_irqrestore(&unaccepted_memory_lock, flags);