From patchwork Thu Mar 30 11:49:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 13194073
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson,
 Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes,
 Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
 Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen,
 Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com,
 aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
 linux-mm@kvack.org, linux-coco@lists.linux.dev,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 "Kirill A. Shutemov", Mike Rapoport, Dave Hansen
Subject: [PATCHv9 01/14] x86/boot: Centralize __pa()/__va() definitions
Date: Thu, 30 Mar 2023 14:49:43 +0300
Message-Id: <20230330114956.20342-2-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>

Replace multiple __pa()/__va() definitions with a single one in misc.h.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: David Hildenbrand
Reviewed-by: Mike Rapoport
Reviewed-by: Dave Hansen
---
 arch/x86/boot/compressed/ident_map_64.c | 8 --------
 arch/x86/boot/compressed/misc.h         | 9 +++++++++
 arch/x86/boot/compressed/sev.c          | 2 --
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 321a5011042d..bcc956c17872 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -8,14 +8,6 @@
  * Copyright (C) 2016 Kees Cook
  */
 
-/*
- * Since we're dealing with identity mappings, physical and virtual
- * addresses are the same, so override these defines which are ultimately
- * used by the headers in misc.h.
- */
-#define __pa(x) ((unsigned long)(x))
-#define __va(x) ((void *)((unsigned long)(x)))
-
 /* No PAGE_TABLE_ISOLATION support needed either: */
 #undef CONFIG_PAGE_TABLE_ISOLATION
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 20118fb7c53b..2f155a0e3041 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -19,6 +19,15 @@
 /* cpu_feature_enabled() cannot be used this early */
 #define USE_EARLY_PGTABLE_L5
 
+/*
+ * Boot stub deals with identity mappings, physical and virtual addresses are
+ * the same, so override these defines.
+ *
+ * <asm/page.h> will not define them if they are already defined.
+ */
+#define __pa(x) ((unsigned long)(x))
+#define __va(x) ((void *)((unsigned long)(x)))
+
 #include <linux/linkage.h>
 #include <linux/screen_info.h>
 #include <linux/elf.h>
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index d63ad8f99f83..014b89c89088 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -104,9 +104,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 }
 
 #undef __init
-#undef __pa
 #define __init
-#define __pa(x) ((unsigned long)(x))
 
 #define __BOOT_COMPRESSED
 

From patchwork Thu Mar 30 11:49:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 13194069
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson,
 Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes,
 Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
 Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen,
 Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com,
 aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
 linux-mm@kvack.org, linux-coco@lists.linux.dev,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 "Kirill A. Shutemov", Mike Rapoport
Subject: [PATCHv9 02/14] mm: Add support for unaccepted memory
Date: Thu, 30 Mar 2023 14:49:44 +0300
Message-Id: <20230330114956.20342-3-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>

UEFI Specification version 2.9 introduces the concept of memory acceptance.
Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
require memory to be accepted before it can be used by the guest.
Accepting happens via a protocol specific to the Virtual Machine
platform.

There are several ways the kernel can deal with unaccepted memory:

 1. Accept all the memory during boot. It is easy to implement and
    has no runtime cost once the system is booted. The downside is a
    very long boot time.

    Acceptance can be parallelized across multiple CPUs to keep it
    manageable (e.g. via DEFERRED_STRUCT_PAGE_INIT), but it tends to
    saturate memory bandwidth and does not scale beyond a certain
    point.

 2. Accept a block of memory on the first use. It requires more
    infrastructure and changes in the page allocator to make it work,
    but it provides good boot time.

    On-demand memory acceptance means a latency spike every time the
    kernel steps onto a new memory block. The spikes go away once the
    workload's data set size stabilizes or all memory is accepted.

 3. Accept all memory in the background. Introduce a thread (or
    multiple) that accepts memory proactively. It minimizes the time
    during which the system experiences latency spikes on memory
    allocation while keeping boot time low.

    This approach cannot function on its own. It is an extension of #2:
    background memory acceptance requires a functional scheduler, but
    the page allocator may need to tap into unaccepted memory before
    that.

    The downside of the approach is that these threads also steal CPU
    cycles and memory bandwidth from the user's workload and may hurt
    user experience.

The patch implements #1 and #2 for now. #2 is the default. Some
workloads may want to use #1 with accept_memory=eager on the kernel
command line. #3 can be implemented later based on user demand.

Support of unaccepted memory requires a few changes in core-mm code:

  - memblock has to accept memory on allocation;

  - page allocator has to accept memory on the first allocation of the
    page;

Memblock change is trivial.

The page allocator is modified to accept pages.
New memory gets accepted before putting pages on free lists. It is
done lazily: only accept new pages when we run out of already accepted
memory. The memory gets accepted until the high watermark is reached.

Architecture has to provide two helpers if it wants to support
unaccepted memory:

 - accept_memory() makes a range of physical addresses accepted.

 - range_contains_unaccepted_memory() checks whether anything within
   the range of physical addresses requires acceptance.

Signed-off-by: Kirill A. Shutemov
Acked-by: Mike Rapoport # memblock
Reviewed-by: Vlastimil Babka
---
 drivers/base/node.c    |   7 ++
 fs/proc/meminfo.c      |   5 ++
 include/linux/mmzone.h |   8 ++
 mm/internal.h          |  13 ++++
 mm/memblock.c          |   9 +++
 mm/mm_init.c           |   7 ++
 mm/page_alloc.c        | 161 +++++++++++++++++++++++++++++++++++++++++
 mm/vmstat.c            |   3 +
 8 files changed, 213 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index b46db17124f3..655975946ef6 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -448,6 +448,9 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     "Node %d ShmemPmdMapped: %8lu kB\n"
 			     "Node %d FileHugePages: %8lu kB\n"
 			     "Node %d FilePmdMapped: %8lu kB\n"
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     "Node %d Unaccepted:     %8lu kB\n"
 #endif
 			     ,
 			     nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
@@ -477,6 +480,10 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)),
 			     nid, K(node_page_state(pgdat, NR_FILE_THPS)),
 			     nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED))
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     ,
+			     nid, K(sum_zone_node_page_state(nid, NR_UNACCEPTED))
 #endif
 			    );
 	len += hugetlb_report_node_meminfo(buf, len, nid);
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index b43d0bd42762..8dca4d6d96c7 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -168,6 +168,11 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		    global_zone_page_state(NR_FREE_CMA_PAGES));
 #endif
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	show_val_kb(m, "Unaccepted:     ",
+		    global_zone_page_state(NR_UNACCEPTED));
+#endif
+
 	hugetlb_report_meminfo(m);
 
 	arch_report_meminfo(m);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 72837e019bd1..c5f50ad19870 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -148,6 +148,9 @@ enum zone_stat_item {
 	NR_ZSPAGES,		/* allocated in zsmalloc */
 #endif
 	NR_FREE_CMA_PAGES,
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	NR_UNACCEPTED,
+#endif
 	NR_VM_ZONE_STAT_ITEMS };
 
 enum node_stat_item {
@@ -919,6 +922,11 @@ struct zone {
 	/* free areas of different sizes */
 	struct free_area	free_area[MAX_ORDER + 1];
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	/* Pages to be accepted. All pages on the list are MAX_ORDER */
+	struct list_head	unaccepted_pages;
+#endif
+
 	/* zone flags, see below */
 	unsigned long		flags;
 
diff --git a/mm/internal.h b/mm/internal.h
index c05ad651b515..748bfeac1fea 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1114,4 +1114,17 @@ struct vma_prepare {
 	struct vm_area_struct *remove;
 	struct vm_area_struct *remove2;
 };
+
+#ifndef CONFIG_UNACCEPTED_MEMORY
+static inline bool range_contains_unaccepted_memory(phys_addr_t start,
+						    phys_addr_t end)
+{
+	return false;
+}
+
+static inline void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+}
+#endif
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/memblock.c b/mm/memblock.c
index 7911224b1ed3..54f89d9ac98e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1436,6 +1436,15 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 		 */
 		kmemleak_alloc_phys(found, size, 0);
 
+	/*
+	 * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
+	 * require memory to be accepted before it can be used by the
+	 * guest.
+	 *
+	 * Accept the memory of the allocated buffer.
+	 */
+	accept_memory(found, found + size);
+
 	return found;
 }
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index dd3a6ed9663f..5e5afbefda1e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1373,6 +1373,10 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 		INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
 		zone->free_area[order].nr_free = 0;
 	}
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	INIT_LIST_HEAD(&zone->unaccepted_pages);
+#endif
 }
 
 void __meminit init_currently_empty_zone(struct zone *zone,
@@ -1958,6 +1962,9 @@ static void __init deferred_free_range(unsigned long pfn,
 		return;
 	}
 
+	/* Accept chunks smaller than MAX_ORDER upfront */
+	accept_memory(PFN_PHYS(pfn), PFN_PHYS(pfn + nr_pages));
+
 	for (i = 0; i < nr_pages; i++, page++, pfn++) {
 		if (pageblock_aligned(pfn))
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0767dd6bc5ba..d62fcb2f28bd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -387,6 +387,11 @@ EXPORT_SYMBOL(nr_node_ids);
 EXPORT_SYMBOL(nr_online_nodes);
 #endif
 
+static bool page_contains_unaccepted(struct page *page, unsigned int order);
+static void accept_page(struct page *page, unsigned int order);
+static bool try_to_accept_memory(struct zone *zone, unsigned int order);
+static bool __free_unaccepted(struct page *page);
+
 int page_group_by_mobility_disabled __read_mostly;
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -1481,6 +1486,13 @@ void __free_pages_core(struct page *page, unsigned int order)
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
 
+	if (page_contains_unaccepted(page, order)) {
+		if (order == MAX_ORDER && __free_unaccepted(page))
+			return;
+
+		accept_page(page, order);
+	}
+
 	/*
 	 * Bypass PCP and place fresh pages right to the tail, primarily
 	 * relevant for memory onlining.
@@ -3150,6 +3162,9 @@ static inline long __zone_watermark_unusable_free(struct zone *z,
 	if (!(alloc_flags & ALLOC_CMA))
 		unusable_free += zone_page_state(z, NR_FREE_CMA_PAGES);
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	unusable_free += zone_page_state(z, NR_UNACCEPTED);
+#endif
 
 	return unusable_free;
 }
@@ -3449,6 +3464,9 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				       gfp_mask)) {
 			int ret;
 
+			if (try_to_accept_memory(zone, order))
+				goto try_this_zone;
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/*
 			 * Watermark failed for this zone, but see if we can
@@ -3501,6 +3519,9 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 
 			return page;
 		} else {
+			if (try_to_accept_memory(zone, order))
+				goto try_this_zone;
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/* Try again if zone has deferred pages */
 			if (deferred_pages_enabled()) {
@@ -7184,3 +7205,143 @@ bool has_managed_dma(void)
 	return false;
 }
 #endif /* CONFIG_ZONE_DMA */
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+/* Counts number of zones with unaccepted pages. */
+static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
+
+static bool lazy_accept = true;
+
+static int __init accept_memory_parse(char *p)
+{
+	if (!strcmp(p, "lazy")) {
+		lazy_accept = true;
+		return 0;
+	} else if (!strcmp(p, "eager")) {
+		lazy_accept = false;
+		return 0;
+	} else {
+		return -EINVAL;
+	}
+}
+early_param("accept_memory", accept_memory_parse);
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+	phys_addr_t end = start + (PAGE_SIZE << order);
+
+	return range_contains_unaccepted_memory(start, end);
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+
+	accept_memory(start, start + (PAGE_SIZE << order));
+}
+
+static bool try_to_accept_memory_one(struct zone *zone)
+{
+	unsigned long flags;
+	struct page *page;
+	bool last;
+
+	if (list_empty(&zone->unaccepted_pages))
+		return false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	page = list_first_entry_or_null(&zone->unaccepted_pages,
+					struct page, lru);
+	if (!page) {
+		spin_unlock_irqrestore(&zone->lock, flags);
+		return false;
+	}
+
+	list_del(&page->lru);
+	last = list_empty(&zone->unaccepted_pages);
+
+	__mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	__mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	accept_page(page, MAX_ORDER);
+
+	__free_pages_ok(page, MAX_ORDER, FPI_TO_TAIL);
+
+	if (last)
+		static_branch_dec(&zones_with_unaccepted_pages);
+
+	return true;
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+	long to_accept;
+	int ret = false;
+
+	if (!static_branch_unlikely(&zones_with_unaccepted_pages))
+		return false;
+
+	/* How much to accept to get to high watermark? */
+	to_accept = high_wmark_pages(zone) -
+		    (zone_page_state(zone, NR_FREE_PAGES) -
+		    __zone_watermark_unusable_free(zone, order, 0));
+
+	/* Accept at least one page */
+	do {
+		if (!try_to_accept_memory_one(zone))
+			break;
+		ret = true;
+		to_accept -= MAX_ORDER_NR_PAGES;
+	} while (to_accept > 0);
+
+	return ret;
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long flags;
+	bool first = false;
+
+	if (!lazy_accept)
+		return false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	first = list_empty(&zone->unaccepted_pages);
+	list_add_tail(&page->lru, &zone->unaccepted_pages);
+	__mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	__mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (first)
+		static_branch_inc(&zones_with_unaccepted_pages);
+
+	return true;
+}
+
+#else
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+	return false;
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+	return false;
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+	BUILD_BUG();
+	return false;
+}
+
+#endif /* CONFIG_UNACCEPTED_MEMORY */
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 0a6d742322db..16ec8b994ef3 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1256,6 +1256,9 @@ const char * const vmstat_text[] = {
 	"nr_zspages",
 #endif
 	"nr_free_cma",
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	"nr_unaccepted",
+#endif
 
 	/* enum numa_stat_item counters */
 #ifdef CONFIG_NUMA

From patchwork Thu Mar 30 11:49:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 13194067
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson,
 Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes,
 Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
 Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen,
 Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com,
 aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
 linux-mm@kvack.org, linux-coco@lists.linux.dev,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
 "Kirill A. Shutemov"
Subject: [PATCHv9 03/14] mm/page_alloc: Fake unaccepted memory
Date: Thu, 30 Mar 2023 14:49:45 +0300
Message-Id: <20230330114956.20342-4-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>

For testing purposes, it is useful to fake unaccepted memory in the
system. It helps to understand the overhead that unaccepted memory adds
to the page allocator.

The patch allows memory above a specified physical address to be
treated as unaccepted.

The change only fakes unaccepted memory for the page allocator.
Memblock is not affected.

It also assumes that arch-provided accept_memory() on already accepted
memory is a nop.

Signed-off-by: Kirill A.
Shutemov --- mm/page_alloc.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d62fcb2f28bd..509a93b7e5af 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7213,6 +7213,8 @@ static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages); static bool lazy_accept = true; +static unsigned long fake_unaccepted_start = -1UL; + static int __init accept_memory_parse(char *p) { if (!strcmp(p, "lazy")) { @@ -7227,11 +7229,30 @@ static int __init accept_memory_parse(char *p) } early_param("accept_memory", accept_memory_parse); +static int __init fake_unaccepted_start_parse(char *p) +{ + if (!p) + return -EINVAL; + + fake_unaccepted_start = memparse(p, &p); + + if (*p != '\0') { + fake_unaccepted_start = -1UL; + return -EINVAL; + } + + return 0; +} +early_param("fake_unaccepted_start", fake_unaccepted_start_parse); + static bool page_contains_unaccepted(struct page *page, unsigned int order) { phys_addr_t start = page_to_phys(page); phys_addr_t end = start + (PAGE_SIZE << order); + if (start >= fake_unaccepted_start) + return true; + return range_contains_unaccepted_memory(start, end); } From patchwork Thu Mar 30 11:49:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 13194066
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv9 04/14] mm/page_alloc: Add sysfs handle to accept accept_memory
Date: Thu, 30 Mar 2023 14:49:46 +0300
Message-Id: <20230330114956.20342-5-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0

Write the amount of memory to accept into the new sysfs handle /sys/kernel/mm/page_alloc/accept_memory.

Write 'all' to the handle to accept all memory in the system.

The handle can be used to implement background memory accepting from userspace. It is also useful for debugging.

Signed-off-by: Kirill A.
Shutemov --- mm/page_alloc.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 509a93b7e5af..07e16e9b49c4 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7343,6 +7343,45 @@ static bool __free_unaccepted(struct page *page) return true; } +static ssize_t accept_memory_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + unsigned long to_accept = 0; + struct zone *zone; + char *retptr; + + if (sysfs_streq(buf, "all")) { + to_accept = ULONG_MAX; + } else { + to_accept = memparse(buf, &retptr); + + /* Get rid of trailing whitespace, including '\n' */ + retptr = skip_spaces(retptr); + + if (*retptr != 0 || to_accept == 0) + return -EINVAL; + } + + for_each_populated_zone(zone) { + while (try_to_accept_memory_one(zone)) { + if (to_accept <= PAGE_SIZE << MAX_ORDER) + return count; + + to_accept -= PAGE_SIZE << MAX_ORDER; + } + } + + return count; +} + +static struct kobj_attribute accept_memory_attr = __ATTR_WO(accept_memory); + +static struct attribute *page_alloc_attr[] = { + &accept_memory_attr.attr, + NULL +}; + #else static bool page_contains_unaccepted(struct page *page, unsigned int order) @@ -7366,3 +7405,28 @@ static bool __free_unaccepted(struct page *page) } #endif /* CONFIG_UNACCEPTED_MEMORY */ + +static const struct attribute_group page_alloc_attr_group = { +#ifdef CONFIG_UNACCEPTED_MEMORY + .attrs = page_alloc_attr, +#endif +}; + +static int __init page_alloc_init_sysfs(void) +{ + struct kobject *page_alloc_kobj; + int err; + + page_alloc_kobj = kobject_create_and_add("page_alloc", mm_kobj); + if (!page_alloc_kobj) + return -ENOMEM; + + err = sysfs_create_group(page_alloc_kobj, &page_alloc_attr_group); + if (err) { + kobject_put(page_alloc_kobj); + return err; + } + + return 0; +} +late_initcall(page_alloc_init_sysfs); From patchwork Thu Mar 30 11:49:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 13194072
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov", Borislav Petkov
Subject: [PATCHv9 05/14] efi/x86: Get full memory map in allocate_e820()
Date: Thu, 30 Mar 2023 14:49:47 +0300
Message-Id: <20230330114956.20342-6-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0

Currently allocate_e820() is only interested in the size of the memory map and the size of a memory descriptor, which it uses to determine how many e820 entries the kernel needs.

UEFI Specification version 2.9 introduces a new memory type -- unaccepted memory. To track unaccepted memory, the kernel needs to allocate a bitmap. The size of the bitmap depends on the maximum physical address present in the system. A full memory map is required to find the maximum address.

Modify allocate_e820() to get a full memory map.

Signed-off-by: Kirill A.
Shutemov Reviewed-by: Borislav Petkov --- drivers/firmware/efi/libstub/x86-stub.c | 26 +++++++++++-------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index a0bfd31358ba..fff81843169c 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -681,28 +681,24 @@ static efi_status_t allocate_e820(struct boot_params *params, struct setup_data **e820ext, u32 *e820ext_size) { - unsigned long map_size, desc_size, map_key; + struct efi_boot_memmap *map; efi_status_t status; - __u32 nr_desc, desc_version; + __u32 nr_desc; - /* Only need the size of the mem map and size of each mem descriptor */ - map_size = 0; - status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key, - &desc_size, &desc_version); - if (status != EFI_BUFFER_TOO_SMALL) - return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED; - - nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS; + status = efi_get_memory_map(&map, false); + if (status != EFI_SUCCESS) + return status; - if (nr_desc > ARRAY_SIZE(params->e820_table)) { - u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table); + nr_desc = map->map_size / map->desc_size; + if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) { + u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) + + EFI_MMAP_NR_SLACK_SLOTS; status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size); - if (status != EFI_SUCCESS) - return status; } - return EFI_SUCCESS; + efi_bs_call(free_pool, map); + return status; } struct exit_boot_struct { From patchwork Thu Mar 30 11:49:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 13194071
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv9 06/14] x86/boot: Add infrastructure required for unaccepted memory support
Date: Thu, 30 Mar 2023 14:49:48 +0300
Message-Id: <20230330114956.20342-7-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0

Pull in the functionality from the main kernel headers and lib/ that is required for unaccepted memory support.

This is a preparatory patch. The users of the functionality will come in the following patches.

Signed-off-by: Kirill A.
Shutemov Reviewed-by: Borislav Petkov (AMD) --- arch/x86/boot/bitops.h | 40 ++++++++++++ arch/x86/boot/compressed/align.h | 14 +++++ arch/x86/boot/compressed/bitmap.c | 43 +++++++++++++ arch/x86/boot/compressed/bitmap.h | 49 +++++++++++++++ arch/x86/boot/compressed/bits.h | 36 +++++++++++ arch/x86/boot/compressed/find.c | 54 ++++++++++++++++ arch/x86/boot/compressed/find.h | 79 ++++++++++++++++++++++++ arch/x86/boot/compressed/math.h | 37 +++++++++++ arch/x86/boot/compressed/minmax.h | 61 ++++++++++++++++++ arch/x86/boot/compressed/pgtable_types.h | 25 ++++++++ 10 files changed, 438 insertions(+) create mode 100644 arch/x86/boot/compressed/align.h create mode 100644 arch/x86/boot/compressed/bitmap.c create mode 100644 arch/x86/boot/compressed/bitmap.h create mode 100644 arch/x86/boot/compressed/bits.h create mode 100644 arch/x86/boot/compressed/find.c create mode 100644 arch/x86/boot/compressed/find.h create mode 100644 arch/x86/boot/compressed/math.h create mode 100644 arch/x86/boot/compressed/minmax.h create mode 100644 arch/x86/boot/compressed/pgtable_types.h diff --git a/arch/x86/boot/bitops.h b/arch/x86/boot/bitops.h index 8518ae214c9b..38badf028543 100644 --- a/arch/x86/boot/bitops.h +++ b/arch/x86/boot/bitops.h @@ -41,4 +41,44 @@ static inline void set_bit(int nr, void *addr) asm("btsl %1,%0" : "+m" (*(u32 *)addr) : "Ir" (nr)); } +static __always_inline void __set_bit(long nr, volatile unsigned long *addr) +{ + asm volatile(__ASM_SIZE(bts) " %1,%0" : : "m" (*(volatile long *) addr), + "Ir" (nr) : "memory"); +} + +static __always_inline void __clear_bit(long nr, volatile unsigned long *addr) +{ + asm volatile(__ASM_SIZE(btr) " %1,%0" : : "m" (*(volatile long *) addr), + "Ir" (nr) : "memory"); +} + +/** + * __ffs - find first set bit in word + * @word: The word to search + * + * Undefined if no bit exists, so code should check against 0 first. 
+ */ +static __always_inline unsigned long __ffs(unsigned long word) +{ + asm("rep; bsf %1,%0" + : "=r" (word) + : "rm" (word)); + return word; +} + +/** + * ffz - find first zero bit in word + * @word: The word to search + * + * Undefined if no zero exists, so code should check against ~0UL first. + */ +static __always_inline unsigned long ffz(unsigned long word) +{ + asm("rep; bsf %1,%0" + : "=r" (word) + : "r" (~word)); + return word; +} + #endif /* BOOT_BITOPS_H */ diff --git a/arch/x86/boot/compressed/align.h b/arch/x86/boot/compressed/align.h new file mode 100644 index 000000000000..7ccabbc5d1b8 --- /dev/null +++ b/arch/x86/boot/compressed/align.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_ALIGN_H +#define BOOT_ALIGN_H +#define _LINUX_ALIGN_H /* Inhibit inclusion of */ + +/* @a is a power of 2 value */ +#define ALIGN(x, a) __ALIGN_KERNEL((x), (a)) +#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a)) +#define __ALIGN_MASK(x, mask) __ALIGN_KERNEL_MASK((x), (mask)) +#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a))) +#define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a))) +#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0) + +#endif diff --git a/arch/x86/boot/compressed/bitmap.c b/arch/x86/boot/compressed/bitmap.c new file mode 100644 index 000000000000..789ecadeb521 --- /dev/null +++ b/arch/x86/boot/compressed/bitmap.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "bitmap.h" + +void __bitmap_set(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p = map + BIT_WORD(start); + const unsigned int size = start + len; + int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_set >= 0) { + *p |= mask_to_set; + len -= bits_to_set; + bits_to_set = BITS_PER_LONG; + mask_to_set = ~0UL; + p++; + } + if (len) { + mask_to_set &= 
BITMAP_LAST_WORD_MASK(size); + *p |= mask_to_set; + } +} + +void __bitmap_clear(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p = map + BIT_WORD(start); + const unsigned int size = start + len; + int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_clear >= 0) { + *p &= ~mask_to_clear; + len -= bits_to_clear; + bits_to_clear = BITS_PER_LONG; + mask_to_clear = ~0UL; + p++; + } + if (len) { + mask_to_clear &= BITMAP_LAST_WORD_MASK(size); + *p &= ~mask_to_clear; + } +} diff --git a/arch/x86/boot/compressed/bitmap.h b/arch/x86/boot/compressed/bitmap.h new file mode 100644 index 000000000000..35357f5feda2 --- /dev/null +++ b/arch/x86/boot/compressed/bitmap.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_BITMAP_H +#define BOOT_BITMAP_H +#define __LINUX_BITMAP_H /* Inhibit inclusion of */ + +#include "../bitops.h" +#include "../string.h" +#include "align.h" + +#define BITMAP_MEM_ALIGNMENT 8 +#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1) + +#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1))) +#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) + +#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) + +void __bitmap_set(unsigned long *map, unsigned int start, int len); +void __bitmap_clear(unsigned long *map, unsigned int start, int len); + +static __always_inline void bitmap_set(unsigned long *map, unsigned int start, + unsigned int nbits) +{ + if (__builtin_constant_p(nbits) && nbits == 1) + __set_bit(start, map); + else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && + IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && + __builtin_constant_p(nbits & BITMAP_MEM_MASK) && + IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) + memset((char *)map + start / 8, 0xff, nbits / 8); + else + __bitmap_set(map, start, nbits); +} + +static __always_inline void bitmap_clear(unsigned long *map, 
unsigned int start, + unsigned int nbits) +{ + if (__builtin_constant_p(nbits) && nbits == 1) + __clear_bit(start, map); + else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && + IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && + __builtin_constant_p(nbits & BITMAP_MEM_MASK) && + IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) + memset((char *)map + start / 8, 0, nbits / 8); + else + __bitmap_clear(map, start, nbits); +} + +#endif diff --git a/arch/x86/boot/compressed/bits.h b/arch/x86/boot/compressed/bits.h new file mode 100644 index 000000000000..b0ffa007ee19 --- /dev/null +++ b/arch/x86/boot/compressed/bits.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_BITS_H +#define BOOT_BITS_H +#define __LINUX_BITS_H /* Inhibit inclusion of */ + +#ifdef __ASSEMBLY__ +#define _AC(X,Y) X +#define _AT(T,X) X +#else +#define __AC(X,Y) (X##Y) +#define _AC(X,Y) __AC(X,Y) +#define _AT(T,X) ((T)(X)) +#endif + +#define _UL(x) (_AC(x, UL)) +#define _ULL(x) (_AC(x, ULL)) +#define UL(x) (_UL(x)) +#define ULL(x) (_ULL(x)) + +#define BIT(nr) (UL(1) << (nr)) +#define BIT_ULL(nr) (ULL(1) << (nr)) +#define BIT_MASK(nr) (UL(1) << ((nr) % BITS_PER_LONG)) +#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) +#define BIT_ULL_MASK(nr) (ULL(1) << ((nr) % BITS_PER_LONG_LONG)) +#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG) +#define BITS_PER_BYTE 8 + +#define GENMASK(h, l) \ + (((~UL(0)) - (UL(1) << (l)) + 1) & \ + (~UL(0) >> (BITS_PER_LONG - 1 - (h)))) + +#define GENMASK_ULL(h, l) \ + (((~ULL(0)) - (ULL(1) << (l)) + 1) & \ + (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h)))) + +#endif diff --git a/arch/x86/boot/compressed/find.c b/arch/x86/boot/compressed/find.c new file mode 100644 index 000000000000..b97a9e7c8085 --- /dev/null +++ b/arch/x86/boot/compressed/find.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include "bitmap.h" +#include "find.h" +#include "math.h" +#include "minmax.h" + +static __always_inline unsigned long swab(const unsigned long y) +{ +#if 
__BITS_PER_LONG == 64 + return __builtin_bswap64(y); +#else /* __BITS_PER_LONG == 32 */ + return __builtin_bswap32(y); +#endif +} + +unsigned long _find_next_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long nbits, + unsigned long start, unsigned long invert, unsigned long le) +{ + unsigned long tmp, mask; + + if (start >= nbits) + return nbits; + + tmp = addr1[start / BITS_PER_LONG]; + if (addr2) + tmp &= addr2[start / BITS_PER_LONG]; + tmp ^= invert; + + /* Handle 1st word. */ + mask = BITMAP_FIRST_WORD_MASK(start); + if (le) + mask = swab(mask); + + tmp &= mask; + + start = round_down(start, BITS_PER_LONG); + + while (!tmp) { + start += BITS_PER_LONG; + if (start >= nbits) + return nbits; + + tmp = addr1[start / BITS_PER_LONG]; + if (addr2) + tmp &= addr2[start / BITS_PER_LONG]; + tmp ^= invert; + } + + if (le) + tmp = swab(tmp); + + return min(start + __ffs(tmp), nbits); +} diff --git a/arch/x86/boot/compressed/find.h b/arch/x86/boot/compressed/find.h new file mode 100644 index 000000000000..903574b9d57a --- /dev/null +++ b/arch/x86/boot/compressed/find.h @@ -0,0 +1,79 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_FIND_H +#define BOOT_FIND_H +#define __LINUX_FIND_H /* Inhibit inclusion of */ + +#include "../bitops.h" +#include "align.h" +#include "bits.h" + +unsigned long _find_next_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long nbits, + unsigned long start, unsigned long invert, unsigned long le); + +/** + * find_next_bit - find the next set bit in a memory region + * @addr: The address to base the search on + * @offset: The bitnumber to start searching at + * @size: The bitmap size in bits + * + * Returns the bit number for the next set bit + * If no bits are set, returns @size.
+ */ +static inline +unsigned long find_next_bit(const unsigned long *addr, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (offset >= size) + return size; + + val = *addr & GENMASK(size - 1, offset); + return val ? __ffs(val) : size; + } + + return _find_next_bit(addr, NULL, size, offset, 0UL, 0); +} + +/** + * find_next_zero_bit - find the next cleared bit in a memory region + * @addr: The address to base the search on + * @offset: The bitnumber to start searching at + * @size: The bitmap size in bits + * + * Returns the bit number of the next zero bit + * If no bits are zero, returns @size. + */ +static inline +unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (offset >= size) + return size; + + val = *addr | ~GENMASK(size - 1, offset); + return val == ~0UL ? size : ffz(val); + } + + return _find_next_bit(addr, NULL, size, offset, ~0UL, 0); +} + +/** + * for_each_set_bitrange_from - iterate over all set bit ranges [b; e) + * @b: bit offset of start of current bitrange (first set bit); must be initialized + * @e: bit offset of end of current bitrange (first unset bit) + * @addr: bitmap address to base the search on + * @size: bitmap size in number of bits + */ +#define for_each_set_bitrange_from(b, e, addr, size) \ + for ((b) = find_next_bit((addr), (size), (b)), \ + (e) = find_next_zero_bit((addr), (size), (b) + 1); \ + (b) < (size); \ + (b) = find_next_bit((addr), (size), (e) + 1), \ + (e) = find_next_zero_bit((addr), (size), (b) + 1)) +#endif diff --git a/arch/x86/boot/compressed/math.h b/arch/x86/boot/compressed/math.h new file mode 100644 index 000000000000..f7eede84bbc2 --- /dev/null +++ b/arch/x86/boot/compressed/math.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_MATH_H +#define BOOT_MATH_H +#define __LINUX_MATH_H /* Inhibit inclusion of */ + +/* + * 
+ * This looks more complex than it should be. But we need to + * get the type for the ~ right in round_down (it needs to be + * as wide as the result!), and we want to evaluate the macro + * arguments just once each. + */ +#define __round_mask(x, y) ((__typeof__(x))((y)-1)) + +/** + * round_up - round up to next specified power of 2 + * @x: the value to round + * @y: multiple to round up to (must be a power of 2) + * + * Rounds @x up to next multiple of @y (which must be a power of 2). + * To perform arbitrary rounding up, use roundup() below. + */ +#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1) + +/** + * round_down - round down to next specified power of 2 + * @x: the value to round + * @y: multiple to round down to (must be a power of 2) + * + * Rounds @x down to next multiple of @y (which must be a power of 2). + * To perform arbitrary rounding down, use rounddown() below. + */ +#define round_down(x, y) ((x) & ~__round_mask(x, y)) + +#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) + +#endif diff --git a/arch/x86/boot/compressed/minmax.h b/arch/x86/boot/compressed/minmax.h new file mode 100644 index 000000000000..4efd05673260 --- /dev/null +++ b/arch/x86/boot/compressed/minmax.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef BOOT_MINMAX_H +#define BOOT_MINMAX_H +#define __LINUX_MINMAX_H /* Inhibit inclusion of */ + +/* + * This returns a constant expression while determining if an argument is + * a constant expression, most importantly without evaluating the argument. + * Glory to Martin Uecker + */ +#define __is_constexpr(x) \ + (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8))) + +/* + * min()/max()/clamp() macros must accomplish three things: + * + * - avoid multiple evaluations of the arguments (so side-effects like + * "x++" happen only once) when non-constant. + * - perform strict type-checking (to generate warnings instead of + * nasty runtime surprises). 
See the "unnecessary" pointer comparison + * in __typecheck(). + * - retain result as a constant expressions when called with only + * constant expressions (to avoid tripping VLA warnings in stack + * allocation usage). + */ +#define __typecheck(x, y) \ + (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) + +#define __no_side_effects(x, y) \ + (__is_constexpr(x) && __is_constexpr(y)) + +#define __safe_cmp(x, y) \ + (__typecheck(x, y) && __no_side_effects(x, y)) + +#define __cmp(x, y, op) ((x) op (y) ? (x) : (y)) + +#define __cmp_once(x, y, unique_x, unique_y, op) ({ \ + typeof(x) unique_x = (x); \ + typeof(y) unique_y = (y); \ + __cmp(unique_x, unique_y, op); }) + +#define __careful_cmp(x, y, op) \ + __builtin_choose_expr(__safe_cmp(x, y), \ + __cmp(x, y, op), \ + __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op)) + +/** + * min - return minimum of two values of the same or compatible types + * @x: first value + * @y: second value + */ +#define min(x, y) __careful_cmp(x, y, <) + +/** + * max - return maximum of two values of the same or compatible types + * @x: first value + * @y: second value + */ +#define max(x, y) __careful_cmp(x, y, >) + +#endif diff --git a/arch/x86/boot/compressed/pgtable_types.h b/arch/x86/boot/compressed/pgtable_types.h new file mode 100644 index 000000000000..8f1d87a69efc --- /dev/null +++ b/arch/x86/boot/compressed/pgtable_types.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef BOOT_COMPRESSED_PGTABLE_TYPES_H +#define BOOT_COMPRESSED_PGTABLE_TYPES_H +#define _ASM_X86_PGTABLE_DEFS_H /* Inhibit inclusion of */ + +#define PAGE_SHIFT 12 + +#ifdef CONFIG_X86_64 +#define PTE_SHIFT 9 +#elif defined CONFIG_X86_PAE +#define PTE_SHIFT 9 +#else /* 2-level */ +#define PTE_SHIFT 10 +#endif + +enum pg_level { + PG_LEVEL_NONE, + PG_LEVEL_4K, + PG_LEVEL_2M, + PG_LEVEL_1G, + PG_LEVEL_512G, + PG_LEVEL_NUM +}; + +#endif From patchwork Thu Mar 30 11:49:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 13194074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1DE1C77B6F for ; Thu, 30 Mar 2023 11:50:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8F99D6B0082; Thu, 30 Mar 2023 07:50:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 88A436B0085; Thu, 30 Mar 2023 07:50:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6B2A36B0083; Thu, 30 Mar 2023 07:50:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 5C5486B0081 for ; Thu, 30 Mar 2023 07:50:39 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 3B71B8095B for ; Thu, 30 Mar 2023 11:50:39 +0000 (UTC) X-FDA: 80625397398.10.4B3711A Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf11.hostedemail.com (Postfix) with ESMTP id 0320A40010 for ; Thu, 30 Mar 2023 11:50:36 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=Fh6KoGXF; spf=none (imf11.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177037; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references:dkim-signature; bh=CHDV823rZOUn0+ZJELanQGogydDmakICM2zRSlXwzXs=; b=C0OQv3oHkB/jPuQJMU9GOMKJTv24lpUto2I1tG93r9xPglqLR0cluvMUgSYB4erR1CB360 oVfDI81e5yQGgMmoE/IQ/Webzs4QeDlNNVkkQwvR8zQ8sJWNKd6ao5I4nUXT+9SQyO8KKM GZNh/xN2FkKeDDw0QsIilYA2YI4c2TY= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=Fh6KoGXF; spf=none (imf11.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177037; a=rsa-sha256; cv=none; b=X9WEtvrLv+MsFPAtY55Mjdoe667uy8OGRDLbaldmVKrCi3lFCIZ+ovB7mmmN2PtapvtPod eMfhT45SgD/ZvzM8kFWNvEZ6/KIsFPF/LiVJOS7RwDOda+L7dLuy+TAlW1oAWknEc7Pn9a fyPn0Pp6OiiAiDRCcz7mv6KoTdUxwzk= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177037; x=1711713037; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CrAH2QrN2ySQ5GvkIHo7a3X9ICKho3MAnrqOOPhg1H8=; b=Fh6KoGXFPoEkVJ9iNgIq2QxDN3kCZhZ2H9BlOlBUMoBQOBixpqhzTIrh /+DP6makR2qDg1tL8dXCWjBx00w33S4cc1H6MDvOEmeXY55NTE73MmC9y f6SuU7EchhwIVk1mCEAsk4ND3ZW3PfSVS1ju4KjbMQ6M0Qz7fXs3St82G RCMkpiKO+AnakOizWUc2a8E1Rp/PaovIPYmLwFjqRuynb1mQMdUQO5Qtb X0dKxlEKiKLHrjP/uJxO387XZuB6xlDiyVMBE4UInZho3NhliwhhJfIPz DvK5nayjeqCfnpGpbuL5FBB3+s8utPUZl1GUnqaQwLEQMDKFD4BWFNAuc Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="339868456" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="339868456" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="1014401439" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="1014401439" Received: from 
ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:12 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 3301A104454; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv9 07/14] efi/x86: Implement support for unaccepted memory Date: Thu, 30 Mar 2023 14:49:49 +0300 Message-Id: <20230330114956.20342-8-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 0320A40010 X-Rspam-User: X-Stat-Signature: a4kfi7mebksyckkffmbzyzha5fqrnboc X-HE-Tag: 1680177036-895438 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: UEFI Specification version 2.9 introduces the concept of memory acceptance: some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific to the Virtual Machine platform. Accepting memory is costly and it makes the VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted.
Firmware communicates this information via the memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 has to be modified on every page acceptance. That leads to table fragmentation, and there's only a limited number of entries in the e820 table. Another option is to mark such memory as usable in e820 and track whether a range has been accepted in a bitmap. One bit in the bitmap represents 2MiB of the address space: one 4k page is enough to track 64GiB of physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- it needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to 2M gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via boot_params. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the maximum address in the memory map. Signed-off-by: Kirill A.
Shutemov --- Documentation/x86/zero-page.rst | 1 + arch/x86/boot/compressed/Makefile | 1 + arch/x86/boot/compressed/mem.c | 73 ++++++++++++++++++++++++ arch/x86/include/asm/unaccepted_memory.h | 10 ++++ arch/x86/include/uapi/asm/bootparam.h | 2 +- drivers/firmware/efi/Kconfig | 14 +++++ drivers/firmware/efi/efi.c | 1 + drivers/firmware/efi/libstub/x86-stub.c | 65 +++++++++++++++++++++ include/linux/efi.h | 3 +- 9 files changed, 168 insertions(+), 2 deletions(-) create mode 100644 arch/x86/boot/compressed/mem.c create mode 100644 arch/x86/include/asm/unaccepted_memory.h diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst index 45aa9cceb4f1..f21905e61ade 100644 --- a/Documentation/x86/zero-page.rst +++ b/Documentation/x86/zero-page.rst @@ -20,6 +20,7 @@ Offset/Size Proto Name Meaning 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information (struct ist_info) 070/008 ALL acpi_rsdp_addr Physical address of ACPI RSDP table +078/008 ALL unaccepted_memory Bitmap of unaccepted memory (1bit == 2M) 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!! 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!! 
0A0/010 ALL sys_desc_table System description table (struct sys_desc_table), diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 6b6cfe607bdb..f62c02348f9a 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -107,6 +107,7 @@ endif vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o +vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/mem.o vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c new file mode 100644 index 000000000000..6b15a0ed8b54 --- /dev/null +++ b/arch/x86/boot/compressed/mem.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../cpuflags.h" +#include "bitmap.h" +#include "error.h" +#include "math.h" + +#define PMD_SHIFT 21 +#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) +#define PMD_MASK (~(PMD_SIZE - 1)) + +static inline void __accept_memory(phys_addr_t start, phys_addr_t end) +{ + /* Platform-specific memory-acceptance call goes here */ + error("Cannot accept memory"); +} + +/* + * The accepted memory bitmap only works at PMD_SIZE granularity. Take + * unaligned start/end addresses and either: + * 1. Accepts the memory immediately and in its entirety + * 2. Accepts unaligned parts, and marks *some* aligned part unaccepted + * + * The function will never reach the bitmap_set() with zero bits to set. + */ +void process_unaccepted_memory(struct boot_params *params, u64 start, u64 end) +{ + /* + * Ensure that at least one bit will be set in the bitmap by + * immediately accepting all regions under 2*PMD_SIZE. This is + * imprecise and may immediately accept some areas that could + * have been represented in the bitmap. 
But, results in simpler + * code below + * + * Consider case like this: + * + * | 4k | 2044k | 2048k | + * ^ 0x0 ^ 2MB ^ 4MB + * + * Only the first 4k has been accepted. The 0MB->2MB region can not be + * represented in the bitmap. The 2MB->4MB region can be represented in + * the bitmap. But, the 0MB->4MB region is <2*PMD_SIZE and will be + * immediately accepted in its entirety. + */ + if (end - start < 2 * PMD_SIZE) { + __accept_memory(start, end); + return; + } + + /* + * No matter how the start and end are aligned, at least one unaccepted + * PMD_SIZE area will remain to be marked in the bitmap. + */ + + /* Immediately accept a unaccepted_memory, + start / PMD_SIZE, (end - start) / PMD_SIZE); +} diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h new file mode 100644 index 000000000000..df0736d32858 --- /dev/null +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2020 Intel Corporation */ +#ifndef _ASM_X86_UNACCEPTED_MEMORY_H +#define _ASM_X86_UNACCEPTED_MEMORY_H + +struct boot_params; + +void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num); + +#endif diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index 01d19fc22346..630a54046af0 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -189,7 +189,7 @@ struct boot_params { __u64 tboot_addr; /* 0x058 */ struct ist_info ist_info; /* 0x060 */ __u64 acpi_rsdp_addr; /* 0x070 */ - __u8 _pad3[8]; /* 0x078 */ + __u64 unaccepted_memory; /* 0x078 */ __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */ __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */ struct sys_desc_table sys_desc_table; /* obsolete! 
*/ /* 0x0a0 */ diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index 043ca31c114e..231f1c70d1db 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -269,6 +269,20 @@ config EFI_COCO_SECRET virt/coco/efi_secret module to access the secrets, which in turn allows userspace programs to access the injected secrets. +config UNACCEPTED_MEMORY + bool + depends on EFI_STUB + help + Some Virtual Machine platforms, such as Intel TDX, require + some memory to be "accepted" by the guest before it can be used. + This mechanism helps prevent malicious hosts from making changes + to guest memory. + + UEFI specification v2.9 introduced EFI_UNACCEPTED_MEMORY memory type. + + This option adds support for unaccepted memory and makes such memory + usable by the kernel. + config EFI_EMBEDDED_FIRMWARE bool select CRYPTO_LIB_SHA256 diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index abeff7dc0b58..7dce06e419c5 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -843,6 +843,7 @@ static __initdata char memory_type_name[][13] = { "MMIO Port", "PAL Code", "Persistent", + "Unaccepted", }; char * __init efi_md_typeattr_format(char *buf, size_t size, diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index fff81843169c..1643ddbde249 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "efistub.h" @@ -613,6 +614,16 @@ setup_e820(struct boot_params *params, struct setup_data *e820ext, u32 e820ext_s e820_type = E820_TYPE_PMEM; break; + case EFI_UNACCEPTED_MEMORY: + if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) { + efi_warn_once( +"The system has unaccepted memory, but kernel does not support it\nConsider enabling CONFIG_UNACCEPTED_MEMORY\n"); + continue; + } + e820_type = E820_TYPE_RAM; + process_unaccepted_memory(params, d->phys_addr, + d->phys_addr + 
PAGE_SIZE * d->num_pages); + break; default: continue; } @@ -677,6 +688,57 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext, return status; } +static efi_status_t allocate_unaccepted_bitmap(struct boot_params *params, + __u32 nr_desc, + struct efi_boot_memmap *map) +{ + unsigned long *mem = NULL; + u64 size, max_addr = 0; + efi_status_t status; + bool found = false; + int i; + + /* Check if there's any unaccepted memory and find the max address */ + for (i = 0; i < nr_desc; i++) { + efi_memory_desc_t *d; + unsigned long m = (unsigned long)map->map; + + d = efi_early_memdesc_ptr(m, map->desc_size, i); + if (d->type == EFI_UNACCEPTED_MEMORY) + found = true; + if (d->phys_addr + d->num_pages * PAGE_SIZE > max_addr) + max_addr = d->phys_addr + d->num_pages * PAGE_SIZE; + } + + if (!found) { + params->unaccepted_memory = 0; + return EFI_SUCCESS; + } + + /* + * If unaccepted memory is present, allocate a bitmap to track what + * memory has to be accepted before access. + * + * One bit in the bitmap represents 2MiB in the address space: + * A 4k bitmap can track 64GiB of physical address space. + * + * In the worst case scenario -- a huge hole in the middle of the + * address space -- It needs 256MiB to handle 4PiB of the address + * space. + * + * The bitmap will be populated in setup_e820() according to the memory + * map after efi_exit_boot_services(). 
+ */ + size = DIV_ROUND_UP(max_addr, PMD_SIZE * BITS_PER_BYTE); + status = efi_allocate_pages(size, (unsigned long *)&mem, ULONG_MAX); + if (status == EFI_SUCCESS) { + memset(mem, 0, size); + params->unaccepted_memory = (unsigned long)mem; + } + + return status; +} + static efi_status_t allocate_e820(struct boot_params *params, struct setup_data **e820ext, u32 *e820ext_size) @@ -697,6 +759,9 @@ static efi_status_t allocate_e820(struct boot_params *params, status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size); } + if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) && status == EFI_SUCCESS) + status = allocate_unaccepted_bitmap(params, nr_desc, map); + efi_bs_call(free_pool, map); return status; } diff --git a/include/linux/efi.h b/include/linux/efi.h index 04a733f0ba95..1d4f0343c710 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -108,7 +108,8 @@ typedef struct { #define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12 #define EFI_PAL_CODE 13 #define EFI_PERSISTENT_MEMORY 14 -#define EFI_MAX_MEMORY_TYPE 15 +#define EFI_UNACCEPTED_MEMORY 15 +#define EFI_MAX_MEMORY_TYPE 16 /* Attribute values: */ #define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */ From patchwork Thu Mar 30 11:49:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 13194070 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DD8AC761A6 for ; Thu, 30 Mar 2023 11:50:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7E3986B007D; Thu, 30 Mar 2023 07:50:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6F5856B007E; Thu, 30 Mar 2023 07:50:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 570326B0080; Thu, 30 Mar 2023 07:50:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 498626B007D for ; Thu, 30 Mar 2023 07:50:37 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 26144C0EB9 for ; Thu, 30 Mar 2023 11:50:37 +0000 (UTC) X-FDA: 80625397314.25.99DB0E3 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by imf03.hostedemail.com (Postfix) with ESMTP id E36DE20013 for ; Thu, 30 Mar 2023 11:50:34 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=cMDkRYFL; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.24) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177035; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=sJ0jWyXBmb0hmPD/qh9m7QTKFLzmCPwA0IjlFxXM8+0=; b=f/8OQvKtA/trFt4OOD1/EFPwe97UeM+kQcCZGS6We/nNXar7kf7A78f8j6yJDVT56RuO2T 0fibG9Ii1l11vYZ6fV1uO2xYwl2ORGqbDanxb1F6YV2L9VRsRWeQIa8vMWj/NU8zwbj7QJ HT2I8qUX02WHDdq68NZn0pjZ4+UQ2w0= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=cMDkRYFL; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.24) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177035; a=rsa-sha256; cv=none; b=BiBlsYwCO0oRmdrB9klTyB9sg8A8APOmXXWEX99kZH8H+bBE/d46VtYR8JVu5RHk8siqQ4 cw/6I2vll5079W+gu6SQXyME2vgUcdfs+JbDZSV9cCmfWeTa5P4kySAgyocIrKss1WpLAj yj0lBBUZRf9fQLHTQbykcW4ZKFEmTUs= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177035; x=1711713035; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=d4/KfDFRLyY7sMUS/zMPpAEAi7ubl+P+2mX+6hzyiNg=; b=cMDkRYFLqg5j606YT775ZoLz4l/T6gZVOcTFhrIACwt3eFvp9jWLvAGz DEiC7OktV7/z9gBO4z4A7wiysxG6xZE7NMWOfpvvoBve77zGGqcllqrMv HjS7sZDiSSOUHMw5TLIHzMxNnIUdA3AkkRyCNvC4WTcdvCDa5rS45ZF0Y WZ3xaRTA7s61ad4QMiR7NJADGqqK47YvYH2isCe/xPnhrLigiITPVT+z2 MBdNNVZeLoQ9O3wGRT4XtIFdN/XPlKF6qqQ1bHns3D/hbsbad7HS2k09H 9hoFjFrQy8D4jZuKLk+h4HJJ9+piF4NKvqIlc0pgXs5uhpeuepI43QmtZ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="342756780" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="342756780" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634856504" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="634856504" Received: from ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by 
orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:16 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 3E8F41044F3; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv9 08/14] x86/boot/compressed: Handle unaccepted memory Date: Thu, 30 Mar 2023 14:49:50 +0300 Message-Id: <20230330114956.20342-9-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: E36DE20013 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: r99fcxxayqqhyfwr8j6a14exkpocthe9 X-HE-Tag: 1680177034-372919 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

The firmware will pre-accept the memory used to run the stub. But, the
stub is responsible for accepting the memory into which it decompresses
the main kernel. Accept memory just before decompression starts.

The stub is also responsible for choosing a physical address in which to
place the decompressed kernel image. The KASLR mechanism will randomize
this physical address. Since the pre-accepted memory region is relatively
small, KASLR would be quite ineffective if it only used the pre-accepted
area (EFI_CONVENTIONAL_MEMORY). Ensure that KASLR randomizes among the
entire physical address space by also including EFI_UNACCEPTED_MEMORY.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/boot/compressed/Makefile        |  2 +-
 arch/x86/boot/compressed/efi.h           |  1 +
 arch/x86/boot/compressed/kaslr.c         | 35 ++++++++++++++++--------
 arch/x86/boot/compressed/mem.c           | 18 ++++++++++++
 arch/x86/boot/compressed/misc.c          |  6 ++++
 arch/x86/boot/compressed/misc.h          |  6 ++++
 arch/x86/include/asm/unaccepted_memory.h |  2 ++
 7 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f62c02348f9a..74f7adee46ad 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -107,7 +107,7 @@ endif
 
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
 vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
-vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/mem.o
+vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/find.o $(obj)/mem.o
 
 vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o
diff --git a/arch/x86/boot/compressed/efi.h b/arch/x86/boot/compressed/efi.h
index 7db2f41b54cd..cf475243b6d5 100644
--- a/arch/x86/boot/compressed/efi.h
+++ b/arch/x86/boot/compressed/efi.h
@@ -32,6 +32,7 @@ typedef struct {
 } efi_table_hdr_t;
 
 #define EFI_CONVENTIONAL_MEMORY		 7
+#define EFI_UNACCEPTED_MEMORY		15
 
 #define EFI_MEMORY_MORE_RELIABLE \
 				((u64)0x0000000000010000ULL)	/* higher reliability */
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 454757fbdfe5..749f0fe7e446 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -672,6 +672,28 @@ static bool process_mem_region(struct mem_vector *region,
 }
 
 #ifdef CONFIG_EFI
+
+/*
+ * Only EFI_CONVENTIONAL_MEMORY and EFI_UNACCEPTED_MEMORY (if supported) are
+ * guaranteed to be free.
+ *
+ * It is more conservative in picking free memory than the EFI spec allows:
+ *
+ * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also free memory
+ * and thus available to place the kernel image into, but in practice there's
+ * firmware where using that memory leads to crashes.
+ */
+static inline bool memory_type_is_free(efi_memory_desc_t *md)
+{
+	if (md->type == EFI_CONVENTIONAL_MEMORY)
+		return true;
+
+	if (md->type == EFI_UNACCEPTED_MEMORY)
+		return IS_ENABLED(CONFIG_UNACCEPTED_MEMORY);
+
+	return false;
+}
+
 /*
  * Returns true if we processed the EFI memmap, which we prefer over the E820
  * table if it is available.
@@ -716,18 +738,7 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
 	for (i = 0; i < nr_desc; i++) {
 		md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
 
-		/*
-		 * Here we are more conservative in picking free memory than
-		 * the EFI spec allows:
-		 *
-		 * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also
-		 * free memory and thus available to place the kernel image into,
-		 * but in practice there's firmware where using that memory leads
-		 * to crashes.
-		 *
-		 * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free.
-		 */
-		if (md->type != EFI_CONVENTIONAL_MEMORY)
+		if (!memory_type_is_free(md))
 			continue;
 
 		if (efi_soft_reserve_enabled() &&
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
index 6b15a0ed8b54..de858a5180b6 100644
--- a/arch/x86/boot/compressed/mem.c
+++ b/arch/x86/boot/compressed/mem.c
@@ -3,12 +3,15 @@
 #include "../cpuflags.h"
 #include "bitmap.h"
 #include "error.h"
+#include "find.h"
 #include "math.h"
 
 #define PMD_SHIFT	21
 #define PMD_SIZE	(_AC(1, UL) << PMD_SHIFT)
 #define PMD_MASK	(~(PMD_SIZE - 1))
 
+extern struct boot_params *boot_params;
+
 static inline void __accept_memory(phys_addr_t start, phys_addr_t end)
 {
 	/* Platform-specific memory-acceptance call goes here */
@@ -71,3 +74,18 @@ void process_unaccepted_memory(struct boot_params *params, u64 start, u64 end)
 	bitmap_set((unsigned long *)params->unaccepted_memory,
 		   start / PMD_SIZE, (end - start) / PMD_SIZE);
 }
+
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	unsigned long range_start, range_end;
+	unsigned long *bitmap, bitmap_size;
+
+	bitmap = (unsigned long *)boot_params->unaccepted_memory;
+	range_start = start / PMD_SIZE;
+	bitmap_size = DIV_ROUND_UP(end, PMD_SIZE);
+
+	for_each_set_bitrange_from(range_start, range_end, bitmap, bitmap_size) {
+		__accept_memory(range_start * PMD_SIZE, range_end * PMD_SIZE);
+		bitmap_clear(bitmap, range_start, range_end - range_start);
+	}
+}
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 014ff222bf4b..186bfd53e042 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -455,6 +455,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #endif
 
 	debug_putstr("\nDecompressing Linux... ");
+
+	if (boot_params->unaccepted_memory) {
+		debug_putstr("Accepting memory... ");
+		accept_memory(__pa(output), __pa(output) + needed_size);
+	}
+
 	__decompress(input_data, input_len, NULL, NULL, output, output_len,
 			NULL, error);
 	entry_offset = parse_elf(output);
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 2f155a0e3041..9663d1839f54 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -247,4 +247,10 @@ static inline unsigned long efi_find_vendor_table(struct boot_params *bp,
 }
 #endif /* CONFIG_EFI */
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+void accept_memory(phys_addr_t start, phys_addr_t end);
+#else
+static inline void accept_memory(phys_addr_t start, phys_addr_t end) {}
+#endif
+
 #endif /* BOOT_COMPRESSED_MISC_H */
diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
index df0736d32858..41fbfc798100 100644
--- a/arch/x86/include/asm/unaccepted_memory.h
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -7,4 +7,6 @@ struct boot_params;
 
 void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num);
 
+void accept_memory(phys_addr_t start, phys_addr_t end);
+
 #endif

From patchwork Thu Mar 30 11:49:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 13194076 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D96BC761A6 for ; Thu, 30 Mar 2023 11:50:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D888E6B0085; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D3BB86B0088; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B63A06B0087; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 9A45F6B0083 for ; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 61DA61C65B4 for ; Thu, 30 Mar 2023 11:50:40 +0000 (UTC) X-FDA: 80625397440.29.6452712 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 47E80C000B for ; Thu, 30 Mar 2023 11:50:38 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=hhOGEnxA; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177038; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=CDsQGJ5mVHwKQTSqTirUAmTn1KY5xyP0qi9c6MOAeR0=; b=5OKePyWNg3n9CxZJQj4eReXVsDzbklJkDKlKAMQFAGN9aknCbxdtI9eXVAH1c8zrTJ7xbv mb4d6PB/bka15zW7O6D+a57KWQLFv8cvRJdPtDq+2qR2KV49gqOmvF2pm9kc2y1V7HRXm0 tqryxZXX8pJjNzjM+RXFUQathyam1Kw= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=hhOGEnxA; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177038; a=rsa-sha256; cv=none; b=MwOsAu6oHljTZA625pyQrqnOL4kE4KtISlyeMMMY+E2jpS5gpAkseGNMyoxySlanzkmrQc 5etDl9pBhFWPXO4W0sT1ZoM9btSe+OW/pJT6Kj0TdmeovQnY4zk0VxCkx7O3MJW370Q8EY qa5mWvQ/bBhWhSXEvmhsUMWuzDik/kM= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177038; x=1711713038; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=f7/hrgLyEe7uCgqzPwQBAiXJTS6cTMMnuaKLYQlQP9E=; b=hhOGEnxAPBygVCkJZvrxEbuW02eMt60pJNw3LbLcv1yv6Cbzvyr9EVVl JmabqcK/rl5DCEmCDx0WXMmbUqSVErIQiF8goH8GDQHESkhih9/sGEoZ0 Zp4w63VuamvHD8NjJzmUTaZ/Z78orpci0gSHPvz0uxAZHLUQmyOD6sy5h /aYx2vGlzgNZy9b1d6KRVjW0WyUaVGnxVEMRBtKParpGf532o5tqXXZ6Z qgb+8PkFl2J6UMaBIPA0RR7Lmrj8/cKmpnX6eif6vRJnq+PJlzfxrtoDd 2Dp8F0cu5HqW8EU0wfQuY2Ye/cusjBKgFpgx/yfWlOel0++eUpDmck6Aj w==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="339868488" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="339868488" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="1014401448" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="1014401448" Received: from ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by 
fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:17 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 49CA4104545; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Mike Rapoport Subject: [PATCHv9 09/14] x86/mm: Reserve unaccepted memory bitmap Date: Thu, 30 Mar 2023 14:49:51 +0300 Message-Id: <20230330114956.20342-10-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: zsaaq3mei53bnqnfxca75uhytn39mmfp X-Rspamd-Queue-Id: 47E80C000B X-HE-Tag: 1680177038-891458 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

A given page of memory can only be accepted once. The kernel has to
accept memory both in the early decompression stage and during normal
runtime.

A bitmap is used to communicate the acceptance state of each page
between the decompression stage and normal runtime.

boot_params is used to communicate the location of the bitmap throughout
the boot. The bitmap is allocated and initially populated in the EFI
stub. The decompression stage accepts pages required for kernel/initrd
and marks these pages accordingly in the bitmap. The main kernel picks
up the bitmap from the same boot_params and uses it to determine what
has to be accepted on allocation.

In the runtime kernel, reserve the bitmap's memory to ensure nothing
overwrites it.

The size of the bitmap is determined with e820__end_of_ram_pfn(), which
relies on setup_e820() marking unaccepted memory as E820_TYPE_RAM.

Signed-off-by: Kirill A. Shutemov
Acked-by: Mike Rapoport
---
 arch/x86/kernel/e820.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index fb8cf953380d..483c36a28d2e 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1316,6 +1316,23 @@ void __init e820__memblock_setup(void)
 	int i;
 	u64 end;
 
+	/*
+	 * Mark unaccepted memory bitmap reserved.
+	 *
+	 * This kind of reservation is usually done from early_reserve_memory(),
+	 * but early_reserve_memory() is called before e820__memory_setup(), so
+	 * e820_table is not finalized and e820__end_of_ram_pfn() cannot be
+	 * used to get the correct RAM size.
+	 */
+	if (boot_params.unaccepted_memory) {
+		unsigned long size;
+
+		/* One bit per 2MB */
+		size = DIV_ROUND_UP(e820__end_of_ram_pfn() * PAGE_SIZE,
+				    PMD_SIZE * BITS_PER_BYTE);
+		memblock_reserve(boot_params.unaccepted_memory, size);
+	}
+
 	/*
 	 * The bootstrap memblock region count maximum is 128 entries
 	 * (INIT_MEMBLOCK_REGIONS), but EFI might pass us more E820 entries

From patchwork Thu Mar 30 11:49:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 13194078 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12A3AC761A6 for ; Thu, 30 Mar 2023 11:50:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9B7BD6B0087; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9671D6B0088; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 795496B0089; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 65AD76B0087 for ; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 3F4431C64B8 for ; Thu, 30 Mar 2023 11:50:41 +0000 (UTC) X-FDA: 80625397482.28.AB44BA6 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf09.hostedemail.com (Postfix) with ESMTP id 3AF15140010 for ; Thu, 30 Mar 2023 11:50:38 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=HXjJYQmy; spf=none (imf09.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177039; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=4c+6RTABm8IpgO4JlTWZJ1FT6J6ytqIY0EMrGiPzEdU=; b=P+TVTkehlamcqRjCn82fxV041ILIyI2h7O7TwNfsGvTaStAh6TJN3c4GoqlBph+VIvxl0X l3gSMvT9eMljkJBUZ8bWY1yJCSEPrnQSZZmJeHJwfD5kymUMuelMCx6BhEFUYNj9Qpo+D7 EkrBijUYE7f5n2wqcEO3a4iI6f2ZBiw= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=HXjJYQmy; spf=none (imf09.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177039; a=rsa-sha256; cv=none; b=ZjVqLf81Ay4yqm6AqE649JTnNdYeOj8pMlKcxSfpmAwup0uOuhCBAVIel4hhChCn/z1LFw mn2jJU1h2pAruD1Nn4QYfQ3aYdCDNcv5C8mnO/EgbPGBZw2tjbZ7VprMscfTGdVPxGCNbp Z/zJGDt4i82dS/+g3DC+qaJfDZ7BN84= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177039; x=1711713039; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EdpB5LGN65LHmZVLPoPRNthJZeU3YLHWwC+hwkbMqQk=; b=HXjJYQmysKy8ldWzwbx7nHjQRt/xem4aw1nxqcD65yKkdTVIZav1joLj Bg3QOxS7qCy6Kcy+Cw/WNyGn5E8QDk4gbL2LzdRyd4h5K7S5qNUQ4DMAC RnzId96ew/NZt0hM6Z4DOn5WKQ2yMQbHmlOMzwf2PdBd7Ei3qCkeSmfly J45RzYE358qHt4iekrh9fVa7SrxbwoacGj+4gfQiQ/28VhajwtSDI25mU wFeZldh+/6y40rZP00QGdb1Bq2PvNeevym7W0QMuoPha/sIKOLMYId2Fr cpfeqpYLyna5lkvINVWpt329usNYRu4igEk1QovIdzFzWbCoXugh5klIx A==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="339868502" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="339868502" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="1014401451" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="1014401451" Received: from ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by 
fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:17 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 54CD11046EE; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv9 10/14] x86/mm: Provide helpers for unaccepted memory Date: Thu, 30 Mar 2023 14:49:52 +0300 Message-Id: <20230330114956.20342-11-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 3AF15140010 X-Rspam-User: X-Stat-Signature: 7u48n4wy3r3rzthoh4gsadomd6rcaadr X-HE-Tag: 1680177038-755977 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Core-mm requires a few helpers to support unaccepted memory:

 - accept_memory() checks the range of addresses against the bitmap and
   accepts memory if needed.

 - range_contains_unaccepted_memory() checks if anything within the
   range requires acceptance.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/include/asm/page.h              |  3 ++
 arch/x86/include/asm/unaccepted_memory.h |  4 ++
 arch/x86/mm/Makefile                     |  2 +
 arch/x86/mm/unaccepted_memory.c          | 61 ++++++++++++++++++++++++
 4 files changed, 70 insertions(+)
 create mode 100644 arch/x86/mm/unaccepted_memory.c

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index d18e5c332cb9..4bab2bb2c9c0 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -19,6 +19,9 @@
 struct page;
 
 #include
+
+#include
+
 extern struct range pfn_mapped[];
 extern int nr_pfn_mapped;
 
diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
index 41fbfc798100..89fc91c61560 100644
--- a/arch/x86/include/asm/unaccepted_memory.h
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -7,6 +7,10 @@ struct boot_params;
 
 void process_unaccepted_memory(struct boot_params *params, u64 start, u64 num);
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
 void accept_memory(phys_addr_t start, phys_addr_t end);
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end);
 
 #endif
+#endif
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index c80febc44cd2..b0ef1755e5c8 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -67,3 +67,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o
 
 obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o
 obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
+
+obj-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o
diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
new file mode 100644
index 000000000000..1df918b21469
--- /dev/null
+++ b/arch/x86/mm/unaccepted_memory.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+/* Protects unaccepted memory bitmap */
+static DEFINE_SPINLOCK(unaccepted_memory_lock);
+
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	unsigned long range_start, range_end;
+	unsigned long *bitmap;
+	unsigned long flags;
+
+	if (!boot_params.unaccepted_memory)
+		return;
+
+	bitmap = __va(boot_params.unaccepted_memory);
+	range_start = start / PMD_SIZE;
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	for_each_set_bitrange_from(range_start, range_end, bitmap,
+				   DIV_ROUND_UP(end, PMD_SIZE)) {
+		unsigned long len = range_end - range_start;
+
+		/* Platform-specific memory-acceptance call goes here */
+		panic("Cannot accept memory: unknown platform\n");
+		bitmap_clear(bitmap, range_start, len);
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+}
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
+{
+	unsigned long *bitmap;
+	unsigned long flags;
+	bool ret = false;
+
+	if (!boot_params.unaccepted_memory)
+		return false;
+
+	bitmap = __va(boot_params.unaccepted_memory);
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	while (start < end) {
+		if (test_bit(start / PMD_SIZE, bitmap)) {
+			ret = true;
+			break;
+		}
+
+		start += PMD_SIZE;
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+
+	return ret;
+}

From patchwork Thu Mar 30 11:49:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 13194075
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton,
 Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
 Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar,
 Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com,
 peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org,
 linux-coco@lists.linux.dev, linux-efi@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Kirill A. Shutemov", Dave Hansen
Subject: [PATCHv9 11/14] x86/mm: Avoid load_unaligned_zeropad() stepping into unaccepted memory
Date: Thu, 30 Mar 2023 14:49:53 +0300
Message-Id: <20230330114956.20342-12-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0

load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.

But, this approach does not work for unaccepted memory. For TDX, a load
from unaccepted memory will not lead to a recoverable exception within
the guest. The guest will exit to the VMM where the only recourse is to
terminate the guest.

There are three parts to fix this issue and comprehensively avoid access
to unaccepted memory. Together these ensure that an extra "guard" page
is accepted in addition to the memory that needs to be used.

1. Implicitly extend the range_contains_unaccepted_memory(start, end)
   checks up to end+2M if 'end' is aligned on a 2M boundary. This may
   require checking a 2M chunk beyond the end of RAM. The bitmap
   allocation is modified to accommodate this.
2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
   aligned on a 2M boundary.
3. Set PageUnaccepted() on both memory that itself needs to be accepted
   *and* memory where the next page needs to be accepted. Essentially,
   make PageUnaccepted(page) a marker for whether work needs to be done
   to make 'page' usable. That work might include accepting pages in
   addition to 'page' itself.

Side note: This leads to something strange. Pages which were accepted at
boot, marked by the firmware as accepted and will never _need_ to be
accepted might have PageUnaccepted() set on them. PageUnaccepted(page)
is a cue to ensure that the next page is accepted before 'page' can be
used.

This is an actual, real-world problem which was discovered during TDX
testing.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Dave Hansen
---
 arch/x86/mm/unaccepted_memory.c         | 39 +++++++++++++++++++++++++
 drivers/firmware/efi/libstub/x86-stub.c |  7 +++++
 2 files changed, 46 insertions(+)

diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
index 1df918b21469..a0a58486eb74 100644
--- a/arch/x86/mm/unaccepted_memory.c
+++ b/arch/x86/mm/unaccepted_memory.c
@@ -23,6 +23,38 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
 	bitmap = __va(boot_params.unaccepted_memory);
 	range_start = start / PMD_SIZE;
 
+	/*
+	 * load_unaligned_zeropad() can lead to unwanted loads across page
+	 * boundaries. The unwanted loads are typically harmless. But, they
+	 * might be made to totally unrelated or even unmapped memory.
+	 * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
+	 * #VE) to recover from these unwanted loads.
+	 *
+	 * But, this approach does not work for unaccepted memory. For TDX, a
+	 * load from unaccepted memory will not lead to a recoverable exception
+	 * within the guest. The guest will exit to the VMM where the only
+	 * recourse is to terminate the guest.
+	 *
+	 * There are three parts to fix this issue and comprehensively avoid
+	 * access to unaccepted memory. Together these ensure that an extra
+	 * "guard" page is accepted in addition to the memory that needs to be
+	 * used:
+	 *
+	 * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
+	 *    checks up to end+2M if 'end' is aligned on a 2M boundary.
+	 *
+	 * 2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
+	 *    aligned on a 2M boundary. (immediately following this comment)
+	 *
+	 * 3. Set PageUnaccepted() on both memory that itself needs to be
+	 *    accepted *and* memory where the next page needs to be accepted.
+	 *    Essentially, make PageUnaccepted(page) a marker for whether work
+	 *    needs to be done to make 'page' usable. That work might include
+	 *    accepting pages in addition to 'page' itself.
+	 */
+	if (!(end % PMD_SIZE))
+		end += PMD_SIZE;
+
 	spin_lock_irqsave(&unaccepted_memory_lock, flags);
 	for_each_set_bitrange_from(range_start, range_end, bitmap,
 				   DIV_ROUND_UP(end, PMD_SIZE)) {
@@ -46,6 +78,13 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
 
 	bitmap = __va(boot_params.unaccepted_memory);
 
+	/*
+	 * Also consider the unaccepted state of the *next* page. See fix #1 in
+	 * the comment on load_unaligned_zeropad() in accept_memory().
+	 */
+	if (!(end % PMD_SIZE))
+		end += PMD_SIZE;
+
 	spin_lock_irqsave(&unaccepted_memory_lock, flags);
 	while (start < end) {
 		if (test_bit(start / PMD_SIZE, bitmap)) {
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 1643ddbde249..1afe7b5b02e1 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -715,6 +715,13 @@ static efi_status_t allocate_unaccepted_bitmap(struct boot_params *params,
 		return EFI_SUCCESS;
 	}
 
+	/*
+	 * range_contains_unaccepted_memory() may need to check one 2M chunk
+	 * beyond the end of RAM to deal with load_unaligned_zeropad(). Make
+	 * sure that the bitmap is large enough to handle it.
+	 */
+	max_addr += PMD_SIZE;
+
 	/*
 	 * If unaccepted memory is present, allocate a bitmap to track what
 	 * memory has to be accepted before access.

From patchwork Thu Mar 30 11:49:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A.
Shutemov" X-Patchwork-Id: 13194079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 600F2C77B60 for ; Thu, 30 Mar 2023 11:50:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DFF0C6B0088; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D5F496B0089; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BDF78900002; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id A5B906B0089 for ; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 7868D1201B4 for ; Thu, 30 Mar 2023 11:50:41 +0000 (UTC) X-FDA: 80625397482.24.193EA63 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf11.hostedemail.com (Postfix) with ESMTP id 553C54001A for ; Thu, 30 Mar 2023 11:50:39 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=EFXNcdOi; spf=none (imf11.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177039; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=/VRyLYcUziIyvK5rDEWkSsh184m1aPMbQy/X3mwxy8E=; b=G4O2OusA3yQH9Ym2k7SVR6AX5NUiB5efYEHBGmYQNHAR5LPBZYJKMSaGtQrlUiZ60kXx4E hHf0dnl0wTroDCqc0IAoxlkfh0grr2NbHK6UXhhv3MN3iuzETLn+qlwMQTLH91Mwt/aY5P 5cBP4Vw5efMQ+2WMBI2NOjT1eOsGlvM= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=EFXNcdOi; spf=none (imf11.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177039; a=rsa-sha256; cv=none; b=Kz9Z2yWPgVCmizR6Yf5vct0A9l3BzvoKkk+gBdvPW/AIUa1fa9n8Lgdt8f0CgaT7+2Wl97 0ZGitbykbfG8pZ1J/L9AZzQ1HQLT5l2jmMBBQ4Zm2Yu9BaSTraYVkN1I6kRVqZYTqXH6LX Tgok5YIrrxYzsoEvjPMAGZjnESDsM8k= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177039; x=1711713039; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aSDNa5qy6AjyVCvAMnDoppa8Rk9jEp1+/qGA3XqX6A0=; b=EFXNcdOiOEnRjfOrpRNt3t12aA14K3WbwBmFRcnJhqgOlEiNsxIaV+lO 0pKWy0PBoClDUfqmPpkbxvhHMpCYVbrNr415MXG/F+rNwno9auM0iA0JT vrBspFfwEsGaEcD2kclJ2cBY2URdwCVb5ms6DTY403ZNpIHG1j8kW0cjg RuWMlzb/Z36EZEh3lnqlJ+wkhsTqlAB6i+lbDtFO4i+qt/C3oE1Zd3Y// 9MGP2Esr8gKnam7yZvVxVac0TmjSIJyLhRcwHwOkKvBL3UrqohP48FCCs 2j+CCr/RHsyQckmcqCmk6emP9lEO5+nvtVRZeHLHS2Q3JLsdCc74eQM2X w==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="339868528" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="339868528" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="1014401453" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="1014401453" Received: from ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by 
fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:17 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 6BA50104CA8; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Dave Hansen Subject: [PATCHv9 12/14] x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub Date: Thu, 30 Mar 2023 14:49:54 +0300 Message-Id: <20230330114956.20342-13-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 553C54001A X-Rspam-User: X-Stat-Signature: xxtux7k3sbn7zorx5ie3bpzwww75nrub X-HE-Tag: 1680177039-234827 X-HE-Meta: 
U2FsdGVkX1+ihHpzD0/HA2ZmS3+ppyuc6+h+/9PKCduIysL7huClCPisHuF+CxJ6jgq1sQhusvEOJ5uipvPzBI6bXOAAhpEVqRNnS9M1U+UYZjl9Xepf2x8TIO36c4xRzWbYRKePb5HF69o1kaODQbXkC3DGDoD1GjSu+KoNFGfqWIezxdkdgY4anyADU8cJ6vihDE4iq97x4RL2oAeDMMf2xi0rjWSzSSk7DahGSYC30HnOh/O7WzPjjv2D1+MhKVurwKVD3rxo8czNs6oLSMdzFi7MwIuvPPHC3g3gHW3jITUBeSUUEAtkX6zw8Ho9ZnKqbk/fLm2Hxb1AT19a8+JZS/wrUeEzhMvj0qau7plYcsYM77aDgPE6GkL5b+dQFqvDuNkgPIoB8uB6XB5G6jaEvZln0+LEkODHD4jmocgZjzx/RlvL+juXwN6lIWiejUbjDtiDG3EixzL3NtKRrkFuv10HUx04Y/UJil1d44nEdCw7xjLF6QA5QUXIhaJoTUzy3zbq+6SeRJeueGVTd8POalbbCfsRupmakCkhwcthmtlUOyGugpZjC0g/zc3fyLGONV8/sjxTJc/Ka6keptGzDYwP0XKyipL83GofO3mMXbOalTOfl4ndk5zI7KizLi7O+PYeuI4GXkRqRiI3sj3CqiBjZJXxtT7k2HoRBcyvkWoVkw8p8yVG/Z4hDw542YPJqCVUQme5IaPRxwO9GriiDjGBuEXsDFshohgkVLkNiuAYIFZaIkf9qs8NL/sFBAlAAPW1agxM4SacJie/eEbku828m0OCQO/fdqEK2miMLIgs2q72qqx90Q82MYh2Z4dmkYjDcNW7W0n6ZSrsvwG/yFFCq9aTg8EQlxZdMiDm28RXFYJE1A/LyrOYbXYqgzXJnBLk1Lo9UWel6uEeRFrUm4q3V8KKVmSaZZbZkR0py1pPLx88u5BFsH/Y4FzLw7GshO3VraR+Tx+qr5v BeuhitBO bEOOoJQU4ej0YKhhSzerxP+H5lsoyG5Pl270Trz0T06y6ckNEzF17F1EE+zBOeMYZAb3tJtBhDqtun9tCWU5jUgQlaEUpss15R4f1QOOz3pmaZ/MK1nTy0hE8EY4KiOwPd9YeBWJIVeKmxk8Y/uqeT+5tciSXl9YiYh+KC4mB7vFLoObBoqhOpkCJdpyeMiet7Y83+w1RiE71Nrc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Memory acceptance requires a hypercall and one or multiple module calls. Make helpers for the calls available in boot stub. It has to accept memory where kernel image and initrd are placed. Signed-off-by: Kirill A. 
Shutemov Reviewed-by: Dave Hansen --- arch/x86/coco/tdx/tdx.c | 32 ------------------- arch/x86/include/asm/shared/tdx.h | 51 +++++++++++++++++++++++++++++++ arch/x86/include/asm/tdx.h | 19 ------------ 3 files changed, 51 insertions(+), 51 deletions(-) diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index 055300e08fb3..a9893f44288f 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -14,20 +14,6 @@ #include #include -/* TDX module Call Leaf IDs */ -#define TDX_GET_INFO 1 -#define TDX_GET_VEINFO 3 -#define TDX_GET_REPORT 4 -#define TDX_ACCEPT_PAGE 6 -#define TDX_WR 8 - -/* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */ -#define TDCS_NOTIFY_ENABLES 0x9100000000000010 - -/* TDX hypercall Leaf IDs */ -#define TDVMCALL_MAP_GPA 0x10001 -#define TDVMCALL_REPORT_FATAL_ERROR 0x10003 - /* MMIO direction */ #define EPT_READ 0 #define EPT_WRITE 1 @@ -51,24 +37,6 @@ #define TDREPORT_SUBTYPE_0 0 -/* - * Wrapper for standard use of __tdx_hypercall with no output aside from - * return code. - */ -static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) -{ - struct tdx_hypercall_args args = { - .r10 = TDX_HYPERCALL_STANDARD, - .r11 = fn, - .r12 = r12, - .r13 = r13, - .r14 = r14, - .r15 = r15, - }; - - return __tdx_hypercall(&args, 0); -} - /* Called from __tdx_hypercall() for unrecoverable failure */ noinstr void __tdx_hypercall_failed(void) { diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h index 4a03993e0785..562b3f4cbde8 100644 --- a/arch/x86/include/asm/shared/tdx.h +++ b/arch/x86/include/asm/shared/tdx.h @@ -12,6 +12,20 @@ #define TDX_CPUID_LEAF_ID 0x21 #define TDX_IDENT "IntelTDX " +/* TDX module Call Leaf IDs */ +#define TDX_GET_INFO 1 +#define TDX_GET_VEINFO 3 +#define TDX_GET_REPORT 4 +#define TDX_ACCEPT_PAGE 6 +#define TDX_WR 8 + +/* TDCS fields. 
To be used by TDG.VM.WR and TDG.VM.RD module calls */ +#define TDCS_NOTIFY_ENABLES 0x9100000000000010 + +/* TDX hypercall Leaf IDs */ +#define TDVMCALL_MAP_GPA 0x10001 +#define TDVMCALL_REPORT_FATAL_ERROR 0x10003 + #ifndef __ASSEMBLY__ /* @@ -38,8 +52,45 @@ struct tdx_hypercall_args { /* Used to request services from the VMM */ u64 __tdx_hypercall(struct tdx_hypercall_args *args, unsigned long flags); +/* + * Wrapper for standard use of __tdx_hypercall with no output aside from + * return code. + */ +static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) +{ + struct tdx_hypercall_args args = { + .r10 = TDX_HYPERCALL_STANDARD, + .r11 = fn, + .r12 = r12, + .r13 = r13, + .r14 = r14, + .r15 = r15, + }; + + return __tdx_hypercall(&args, 0); +} + + /* Called from __tdx_hypercall() for unrecoverable failure */ void __tdx_hypercall_failed(void); +/* + * Used in __tdx_module_call() to gather the output registers' values of the + * TDCALL instruction when requesting services from the TDX module. This is a + * software only structure and not part of the TDX module/VMM ABI + */ +struct tdx_module_output { + u64 rcx; + u64 rdx; + u64 r8; + u64 r9; + u64 r10; + u64 r11; +}; + +/* Used to communicate with the TDX module */ +u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, + struct tdx_module_output *out); + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_SHARED_TDX_H */ diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 28d889c9aa16..234197ec17e4 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -20,21 +20,6 @@ #ifndef __ASSEMBLY__ -/* - * Used to gather the output registers values of the TDCALL and SEAMCALL - * instructions when requesting services from the TDX module. - * - * This is a software only structure and not part of the TDX module/VMM ABI. 
- */ -struct tdx_module_output { - u64 rcx; - u64 rdx; - u64 r8; - u64 r9; - u64 r10; - u64 r11; -}; - /* * Used by the #VE exception handler to gather the #VE exception * info from the TDX module. This is a software only structure @@ -55,10 +40,6 @@ struct ve_info { void __init tdx_early_init(void); -/* Used to communicate with the TDX module */ -u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, - struct tdx_module_output *out); - void tdx_get_ve_info(struct ve_info *ve); bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve); From patchwork Thu Mar 30 11:49:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 13194080 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C83D3C6FD1D for ; Thu, 30 Mar 2023 11:50:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E25AA6B0089; Thu, 30 Mar 2023 07:50:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DD69E6B008A; Thu, 30 Mar 2023 07:50:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C4D27900002; Thu, 30 Mar 2023 07:50:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id B35766B0089 for ; Thu, 30 Mar 2023 07:50:42 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 8FB3080137 for ; Thu, 30 Mar 2023 11:50:42 +0000 (UTC) X-FDA: 80625397524.20.AB58D3E Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf28.hostedemail.com (Postfix) with ESMTP id 7082FC000B for ; Thu, 30 Mar 2023 
11:50:40 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=XoioXYtM; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177040; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=4u4GBPVn1QeuU6neCfosEZSZ/2o4Hx9YEYk8G/HGEwU=; b=1+8e5bSAzUYMnik4DgTEY0rvVxIoi0UPKut1C2bPpm3HONExfxg52T2k1+P+GQ4I7vrhik fPOgb8vJa39S0MI9zQCH+NgkHV5inKwG14wE6rHi/aMgIZrkqSlov9nZDYlUWkzISMDv64 1DLHpkikJOZ6Z8uHIYgz4jopUX5qHIQ= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=XoioXYtM; spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.120) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177040; a=rsa-sha256; cv=none; b=oR7q6CaWW1HERpu6RyKdbEt1seuqGBVooNylvrTIPlmJBoI6WrEKn82eAkLTuBoFf6v3xN B7mI7mertO/t/oscdvuAua3jYn0oyCT6M0otgKUvEalQ5pjp2jPglkDX4AelEE85KUEZN5 /3aZkkbb+OolqJ3ldEuwiopsPQBZQa0= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177040; x=1711713040; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MmsAbUP+stG7qEWYYDNAAzwlLkIUGBDyKypaxDlWzw4=; b=XoioXYtMttTfk2MM/d0Mox/UXuuW4MubJ7m8b9TIg4pHZqv1oCvqTILc 2ftihksspSDm6rL+a/04WAWX9xgDwks+wLT6wp10IRfRCheHMyJOXjNuQ nGrIMYI/TB+XJtUigT3n1GM05u7mn4KggDshGZmsRUSxTTiL11sXtzkLS 
2Z2yFHcAHFC/hIGyWR9H6vFFSiY+6v2bMH7Nr7qn07MNLgx90QzNluIhs 3JEe/ptu65TMWbPWlhU89brfmK2f/Hw6lpZLtAw6yg8q9tAj/wV5k5qYW hCgQdhvpwSFMISXtfeWeOF/7k+D2h+AVAFZhgX/uYt4RrhNdcQunBnEr5 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="339868546" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="339868546" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="1014401458" X-IronPort-AV: E=Sophos;i="5.98,303,1673942400"; d="scan'208";a="1014401458" Received: from ngreburx-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.251.209.91]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Mar 2023 04:50:18 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 75D41104D1D; Thu, 30 Mar 2023 14:50:00 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Dave Hansen , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. 
Shutemov" , Dave Hansen Subject: [PATCHv9 13/14] x86/tdx: Refactor try_accept_one() Date: Thu, 30 Mar 2023 14:49:55 +0300 Message-Id: <20230330114956.20342-14-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: jm6t6rtx98cmih75ozzmayirnn8ickse X-Rspamd-Queue-Id: 7082FC000B X-HE-Tag: 1680177040-29315 X-HE-Meta: U2FsdGVkX18s5DiirCROTyFOollQ4UFDrLOlGht91phm/ITVqAzPESPpoMU1pe3l/JNLLql6Oay/S9CspPGl2QHMEyw2zkLYqMkWRS0aS8DAIN0SgQF1nK6dYyz5d3lUC2jDflT8zlSOzWeXNaegi9ssNHWmIAET2LfavcwFwU6AmN9dTEIE+CvFS9xlzYi9SrJriuPkdZkCNjRfRp78kWqjK2pq+bCL+7aCMMfEtwxxnpfPT4VRtKPtiZJsdrOqTfODi4wzEpKnbCU4Kle7K+gckzEGzzrA8nTzUQI/hSHj0FaNBlHEx0WzzvCS2C7WTgGvf+HjX19s7IuKjIPscNhKcm58cba4J9L6/iHKsavLPCsrZDsm/P4EnI6SaJqpEycnbuEssS9/ZKoo6W0EOZBkLUuVfW0TOS6y73QGUzGM6VpZR212DfVxJpd0mQB7UzgQL2zmsfnAtXzoPVbxIsgCnpcpHZF05f2DJCTVmjwJT7BlFFGuUap/lfM/lMsJOZhrv0T7/3L9grSvYUp1v7mTBaKAHfK+9RPRg3yat6U/U3Cs6nH9eGg2oif9hFzKvMAwwVkkOGlEiGdzZ6j+6eoUKJEAnPDFeAPhYYXG/cKUc79O+p2pPn2hMRjQQXF6S2PC1EwGpj3wpteKQP3GP5RI3LtMr38kE3LsYFOjTtJ91bYV4rTO/Qi74LUpX1b4EVPRoRlGM+GkfEo9apNClkrxwe+dQUYKAsRQ2Ok3AbRt/hHmUFUzgAl2G9bYWCdSQanFJYyDiKpoolFZJYMphKVwz4id6GxfHNWp2D/AoKvYL8Uzs/r/UQYPqggoGyt9Jy6KncTrrI8MQoBrU0aVoud2RIk1On5PCsb0nJjFM8/JE1ibPKi47G9ADqRT2kwe/eqkgBuYGUcIkdazKZQKjqcWc6rfjdxyb471mJ6j0mhNWhE7oHtwGLYniVxR/OrY3/4GR11VxuX+nKPVK4m V76hdTSR k3BZ7hwE/MK+VqFtJDgKJK6fUgrQ4Ei7ccxSsXhP/OPAm2AvDOzWM63a/enim5077CK0NWZBUzjGsd4ATZoAzvq9f1fHeb/oRJ+iSb2f8Je/RaxQKgowmgbKoHQLNfvq7r73g/NIzWiDSZFkaEQlNk5g10p9LtV1XfFGxKs94QQfIwbT89/PJgPcqOSQBQGmpF6/UT2S5+gqeWw8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Rework try_accept_one() to return accepted size instead of modifying 'start' inside the helper. 
It makes 'start' in-only argument and streamlines code on the caller side. Signed-off-by: Kirill A. Shutemov Suggested-by: Borislav Petkov Reviewed-by: Dave Hansen --- arch/x86/coco/tdx/tdx.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index a9893f44288f..9e6557d7514c 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -713,18 +713,18 @@ static bool tdx_cache_flush_required(void) return true; } -static bool try_accept_one(phys_addr_t *start, unsigned long len, - enum pg_level pg_level) +static unsigned long try_accept_one(phys_addr_t start, unsigned long len, + enum pg_level pg_level) { unsigned long accept_size = page_level_size(pg_level); u64 tdcall_rcx; u8 page_size; - if (!IS_ALIGNED(*start, accept_size)) - return false; + if (!IS_ALIGNED(start, accept_size)) + return 0; if (len < accept_size) - return false; + return 0; /* * Pass the page physical address to the TDX module to accept the @@ -743,15 +743,14 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len, page_size = 2; break; default: - return false; + return 0; } - tdcall_rcx = *start | page_size; + tdcall_rcx = start | page_size; if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL)) - return false; + return 0; - *start += accept_size; - return true; + return accept_size; } /* @@ -788,21 +787,22 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) */ while (start < end) { unsigned long len = end - start; + unsigned long accept_size; /* * Try larger accepts first. It gives chance to VMM to keep - * 1G/2M SEPT entries where possible and speeds up process by - * cutting number of hypercalls (if successful). + * 1G/2M Secure EPT entries where possible and speeds up + * process by cutting number of hypercalls (if successful). 
*/ - if (try_accept_one(&start, len, PG_LEVEL_1G)) - continue; - - if (try_accept_one(&start, len, PG_LEVEL_2M)) - continue; - - if (!try_accept_one(&start, len, PG_LEVEL_4K)) + accept_size = try_accept_one(start, len, PG_LEVEL_1G); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_2M); + if (!accept_size) + accept_size = try_accept_one(start, len, PG_LEVEL_4K); + if (!accept_size) return false; + start += accept_size; } return true; From patchwork Thu Mar 30 11:49:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 13194077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8265C761AF for ; Thu, 30 Mar 2023 11:50:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 16B4E6B0083; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0BC3F6B0087; Thu, 30 Mar 2023 07:50:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E21A36B0089; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id C6DF56B0083 for ; Thu, 30 Mar 2023 07:50:40 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 717B340F03 for ; Thu, 30 Mar 2023 11:50:40 +0000 (UTC) X-FDA: 80625397440.06.70509D6 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by imf03.hostedemail.com (Postfix) with ESMTP id 265F92001B for ; Thu, 30 Mar 2023 11:50:36 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel 
header.b=ZnpJksPU; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.24) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680177037; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=MF0qRcUciRiNtFXFbwuMYnO6py7KMxwUzVqLVroA+VI=; b=dv7t6unbb+SSfYHXnhFUNglho132S52RvqGGc2SB8hTq2gOvXZaQ6C9OkmnJBxznoOjN/V AqEk3xUMmdowIqMXkDQ72rG5IaULuzkZcvSdhrSlaK62YzrK3uncFgSteD9aHxzswocNge /O5ZgxLx3t2KoHGBWN0eErYnj4HjSyA= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=ZnpJksPU; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.24) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680177037; a=rsa-sha256; cv=none; b=jkRqVg3+F8+12FheMFJa1+GWuVKGXNILYfMJRzapylXEYCRj1vPlBiZnszsSVtw9npBK1q 70iKrau/fmKVZjJPp+MEFZ1qVGTbZlh3FVPAGaCibheTL7jm4Z8fDi0ud0deOS6NAUwXAE f8QI8mrGkbduoSYdtCeenFG7NYbzMoQ= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680177037; x=1711713037; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xeABudaJnL+acO+dda0KFPliSd+zujBisTvxupxsoaI=; b=ZnpJksPUDstS2EXbvPYVx3koNUdghnrSndgrlV71olT94ROKt3hUOuIN qxIvxM+v3R8UZ/c/BpYOPfY+FlXoCCjw6mLZS5fGlBDelTRDak8RDZcj/ U9mPE78VcUm/h6Re4Tjc9s/Mm85KnDHzgZtYMtLRxaFIIi0rIgWDjE1o5 jhvlq8Pt5uEU5DXFBUXcezXLZUfzC73JjBd0GgrWwiWHKVNTiyzw6sbvH 9J+sfObaSsboJuNmYh8fZvCSLAcINsTC36WhIopUrytPBqqodsVMTvoHz 
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv9 14/14] x86/tdx: Add unaccepted memory support
Date: Thu, 30 Mar 2023 14:49:56 +0300
Message-Id: <20230330114956.20342-15-kirill.shutemov@linux.intel.com>
In-Reply-To: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>

Hookup TDX-specific code to accept memory.

Accepting memory is the same process as converting memory from shared to
private: the kernel notifies the VMM with the MAP_GPA hypercall and then
accepts the pages with the ACCEPT_PAGE module call.

The implementation in the core kernel uses tdx_enc_status_changed(). It is
already used for converting memory to shared and back for I/O
transactions.

The boot stub provides its own implementation of tdx_accept_memory(). It
is similar in structure to tdx_enc_status_changed(), but only cares about
converting memory to private.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/Kconfig                      |  2 +
 arch/x86/boot/compressed/Makefile     |  2 +-
 arch/x86/boot/compressed/error.c      | 19 ++++++
 arch/x86/boot/compressed/error.h      |  1 +
 arch/x86/boot/compressed/mem.c        | 33 +++++++++-
 arch/x86/boot/compressed/tdx-shared.c |  2 +
 arch/x86/boot/compressed/tdx.c        | 39 +++++++++++
 arch/x86/coco/tdx/Makefile            |  2 +-
 arch/x86/coco/tdx/tdx-shared.c        | 95 +++++++++++++++++++++++++++
 arch/x86/coco/tdx/tdx.c               | 86 +-----------------------
 arch/x86/include/asm/shared/tdx.h     |  2 +
 arch/x86/include/asm/tdx.h            |  2 +
 arch/x86/mm/unaccepted_memory.c       |  9 ++-
 13 files changed, 206 insertions(+), 88 deletions(-)
 create mode 100644 arch/x86/boot/compressed/tdx-shared.c
 create mode 100644 arch/x86/coco/tdx/tdx-shared.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index df21fba77db1..448cd869f0bd 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -884,9 +884,11 @@ config INTEL_TDX_GUEST
 	bool "Intel TDX (Trust Domain Extensions) - Guest Support"
 	depends on X86_64 && CPU_SUP_INTEL
 	depends on X86_X2APIC
+	depends on EFI_STUB
 	select ARCH_HAS_CC_PLATFORM
 	select X86_MEM_ENCRYPT
 	select X86_MCE
+	select UNACCEPTED_MEMORY
 	help
 	  Support running as a guest under Intel TDX.  Without this support,
 	  the guest kernel can not boot or run under TDX.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 74f7adee46ad..71d9f71c13eb 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -106,7 +106,7 @@ ifdef CONFIG_X86_64
 endif
 
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
-vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
+vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdx-shared.o $(obj)/tdcall.o
 vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/find.o $(obj)/mem.o
 
 vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
diff --git a/arch/x86/boot/compressed/error.c b/arch/x86/boot/compressed/error.c
index c881878e56d3..5313c5cb2b80 100644
--- a/arch/x86/boot/compressed/error.c
+++ b/arch/x86/boot/compressed/error.c
@@ -22,3 +22,22 @@ void error(char *m)
 	while (1)
 		asm("hlt");
 }
+
+/* EFI libstub provides vsnprintf() */
+#ifdef CONFIG_EFI_STUB
+void panic(const char *fmt, ...)
+{
+	static char buf[1024];
+	va_list args;
+	int len;
+
+	va_start(args, fmt);
+	len = vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	if (len && buf[len - 1] == '\n')
+		buf[len - 1] = '\0';
+
+	error(buf);
+}
+#endif
diff --git a/arch/x86/boot/compressed/error.h b/arch/x86/boot/compressed/error.h
index 1de5821184f1..86fe33b93715 100644
--- a/arch/x86/boot/compressed/error.h
+++ b/arch/x86/boot/compressed/error.h
@@ -6,5 +6,6 @@
 
 void warn(char *m);
 void error(char *m) __noreturn;
+void panic(const char *fmt, ...) __noreturn __cold;
 
 #endif /* BOOT_COMPRESSED_ERROR_H */
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
index de858a5180b6..e6b92e822ddd 100644
--- a/arch/x86/boot/compressed/mem.c
+++ b/arch/x86/boot/compressed/mem.c
@@ -5,6 +5,8 @@
 #include "error.h"
 #include "find.h"
 #include "math.h"
+#include "tdx.h"
+#include
 
 #define PMD_SHIFT	21
 #define PMD_SIZE	(_AC(1, UL) << PMD_SHIFT)
@@ -12,10 +14,39 @@
 
 extern struct boot_params *boot_params;
 
+/*
+ * accept_memory() and process_unaccepted_memory() called from EFI stub which
+ * runs before decompresser and its early_tdx_detect().
+ *
+ * Enumerate TDX directly from the early users.
+ */
+static bool early_is_tdx_guest(void)
+{
+	static bool once;
+	static bool is_tdx;
+
+	if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST))
+		return false;
+
+	if (!once) {
+		u32 eax, sig[3];
+
+		cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax,
+			    &sig[0], &sig[2], &sig[1]);
+		is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig));
+		once = true;
+	}
+
+	return is_tdx;
+}
+
 static inline void __accept_memory(phys_addr_t start, phys_addr_t end)
 {
 	/* Platform-specific memory-acceptance call goes here */
-	error("Cannot accept memory");
+	if (early_is_tdx_guest())
+		tdx_accept_memory(start, end);
+	else
+		error("Cannot accept memory: unknown platform\n");
 }
 
 /*
diff --git a/arch/x86/boot/compressed/tdx-shared.c b/arch/x86/boot/compressed/tdx-shared.c
new file mode 100644
index 000000000000..5ac43762fe13
--- /dev/null
+++ b/arch/x86/boot/compressed/tdx-shared.c
@@ -0,0 +1,2 @@
+#include "error.h"
+#include "../../coco/tdx/tdx-shared.c"
diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 918a7606f53c..de1d4a87418d 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -3,12 +3,17 @@
 #include "../cpuflags.h"
 #include "../string.h"
 #include "../io.h"
+#include "align.h"
 #include "error.h"
+#include "pgtable_types.h"
 
 #include
 #include
 
 #include
+#include
+
+static u64 cc_mask;
 
 /* Called from __tdx_hypercall() for unrecoverable failure */
 void __tdx_hypercall_failed(void)
@@ -16,6 +21,38 @@ void __tdx_hypercall_failed(void)
 	error("TDVMCALL failed. TDX module bug?");
 }
 
+static u64 get_cc_mask(void)
+{
+	struct tdx_module_output out;
+	unsigned int gpa_width;
+
+	/*
+	 * TDINFO TDX module call is used to get the TD execution environment
+	 * information like GPA width, number of available vcpus, debug mode
+	 * information, etc. More details about the ABI can be found in TDX
+	 * Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
+	 * [TDG.VP.INFO].
+	 *
+	 * The GPA width that comes out of this call is critical. TDX guests
+	 * can not meaningfully run without it.
+	 */
+	if (__tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out))
+		error("TDCALL GET_INFO failed (Buggy TDX module!)\n");
+
+	gpa_width = out.rcx & GENMASK(5, 0);
+
+	/*
+	 * The highest bit of a guest physical address is the "sharing" bit.
+	 * Set it for shared pages and clear it for private pages.
+	 */
+	return BIT_ULL(gpa_width - 1);
+}
+
+u64 cc_mkdec(u64 val)
+{
+	return val & ~cc_mask;
+}
+
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
 	struct tdx_hypercall_args args = {
@@ -70,6 +107,8 @@ void early_tdx_detect(void)
 	if (memcmp(TDX_IDENT, sig, sizeof(sig)))
 		return;
 
+	cc_mask = get_cc_mask();
+
 	/* Use hypercalls instead of I/O instructions */
 	pio_ops.f_inb = tdx_inb;
 	pio_ops.f_outb = tdx_outb;
diff --git a/arch/x86/coco/tdx/Makefile b/arch/x86/coco/tdx/Makefile
index 46c55998557d..2c7dcbf1458b 100644
--- a/arch/x86/coco/tdx/Makefile
+++ b/arch/x86/coco/tdx/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 
-obj-y += tdx.o tdcall.o
+obj-y += tdx.o tdx-shared.o tdcall.o
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
new file mode 100644
index 000000000000..ee74f7bbe806
--- /dev/null
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -0,0 +1,95 @@
+#include
+#include
+
+static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
+				    enum pg_level pg_level)
+{
+	unsigned long accept_size = page_level_size(pg_level);
+	u64 tdcall_rcx;
+	u8 page_size;
+
+	if (!IS_ALIGNED(start, accept_size))
+		return 0;
+
+	if (len < accept_size)
+		return 0;
+
+	/*
+	 * Pass the page physical address to the TDX module to accept the
+	 * pending, private page.
+	 *
+	 * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
+	 */
+	switch (pg_level) {
+	case PG_LEVEL_4K:
+		page_size = 0;
+		break;
+	case PG_LEVEL_2M:
+		page_size = 1;
+		break;
+	case PG_LEVEL_1G:
+		page_size = 2;
+		break;
+	default:
+		return 0;
+	}
+
+	tdcall_rcx = start | page_size;
+	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
+		return 0;
+
+	return accept_size;
+}
+
+bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end, bool enc)
+{
+	if (!enc) {
+		/* Set the shared (decrypted) bits: */
+		start |= cc_mkdec(0);
+		end   |= cc_mkdec(0);
+	}
+
+	/*
+	 * Notify the VMM about page mapping conversion. More info about ABI
+	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
+	 * section "TDG.VP.VMCALL"
+	 */
+	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
+		return false;
+
+	/* private->shared conversion requires only MapGPA call */
+	if (!enc)
+		return true;
+
+	/*
+	 * For shared->private conversion, accept the page using
+	 * TDX_ACCEPT_PAGE TDX module call.
+	 */
+	while (start < end) {
+		unsigned long len = end - start;
+		unsigned long accept_size;
+
+		/*
+		 * Try larger accepts first. It gives chance to VMM to keep
+		 * 1G/2M Secure EPT entries where possible and speeds up
+		 * process by cutting number of hypercalls (if successful).
+		 */
+
+		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
+		if (!accept_size)
+			return false;
+		start += accept_size;
+	}
+
+	return true;
+}
+
+void tdx_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	if (!tdx_enc_status_changed_phys(start, end, true))
+		panic("Accepting memory failed: %#llx-%#llx\n", start, end);
+}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 9e6557d7514c..1392ebc3b406 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -713,46 +713,6 @@ static bool tdx_cache_flush_required(void)
 	return true;
 }
 
-static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
-				    enum pg_level pg_level)
-{
-	unsigned long accept_size = page_level_size(pg_level);
-	u64 tdcall_rcx;
-	u8 page_size;
-
-	if (!IS_ALIGNED(start, accept_size))
-		return 0;
-
-	if (len < accept_size)
-		return 0;
-
-	/*
-	 * Pass the page physical address to the TDX module to accept the
-	 * pending, private page.
-	 *
-	 * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
-	 */
-	switch (pg_level) {
-	case PG_LEVEL_4K:
-		page_size = 0;
-		break;
-	case PG_LEVEL_2M:
-		page_size = 1;
-		break;
-	case PG_LEVEL_1G:
-		page_size = 2;
-		break;
-	default:
-		return 0;
-	}
-
-	tdcall_rcx = start | page_size;
-	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
-		return 0;
-
-	return accept_size;
-}
-
 /*
  * Inform the VMM of the guest's intent for this physical page: shared with
  * the VMM or private to the guest.  The VMM is expected to change its mapping
@@ -761,51 +721,9 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
 	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
-
-	if (!enc) {
-		/* Set the shared (decrypted) bits: */
-		start |= cc_mkdec(0);
-		end   |= cc_mkdec(0);
-	}
-
-	/*
-	 * Notify the VMM about page mapping conversion. More info about ABI
-	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
-	 * section "TDG.VP.VMCALL"
-	 */
-	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
-		return false;
-
-	/* private->shared conversion requires only MapGPA call */
-	if (!enc)
-		return true;
+	phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
 
-	/*
-	 * For shared->private conversion, accept the page using
-	 * TDX_ACCEPT_PAGE TDX module call.
-	 */
-	while (start < end) {
-		unsigned long len = end - start;
-		unsigned long accept_size;
-
-		/*
-		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M Secure EPT entries where possible and speeds up
-		 * process by cutting number of hypercalls (if successful).
-		 */
-
-		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
-		if (!accept_size)
-			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
-		if (!accept_size)
-			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
-		if (!accept_size)
-			return false;
-		start += accept_size;
-	}
-
-	return true;
+	return tdx_enc_status_changed_phys(start, end, enc);
 }
 
 void __init tdx_early_init(void)
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 562b3f4cbde8..3afbba545a0d 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -92,5 +92,7 @@ struct tdx_module_output {
 u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
 		      struct tdx_module_output *out);
 
+void tdx_accept_memory(phys_addr_t start, phys_addr_t end);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_SHARED_TDX_H */
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 234197ec17e4..3a7340ad9a4b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -50,6 +50,8 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end, bool enc);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
index a0a58486eb74..a521f8c0987d 100644
--- a/arch/x86/mm/unaccepted_memory.c
+++ b/arch/x86/mm/unaccepted_memory.c
@@ -6,6 +6,7 @@
 
 #include
 #include
+#include
 #include
 
 /* Protects unaccepted memory bitmap */
@@ -61,7 +62,13 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
 		unsigned long len = range_end - range_start;
 
 		/* Platform-specific memory-acceptance call goes here */
-		panic("Cannot accept memory: unknown platform\n");
+		if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
+			tdx_accept_memory(range_start * PMD_SIZE,
+					  range_end * PMD_SIZE);
+		} else {
+			panic("Cannot accept memory: unknown platform\n");
+		}
+
 		bitmap_clear(bitmap, range_start, len);
 	}
 	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
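
Illustration only, not part of the patch: the largest-first loop in
tdx_enc_status_changed_phys() can be modelled in user space as below. This is
a rough sketch under stated assumptions. Every sketch_* name and the SZ_*
macros are hypothetical, and the TDX_ACCEPT_PAGE module call is replaced by a
counter, so the code only demonstrates the alignment and length checks and
the order in which page sizes are tried.

```c
#include <assert.h>	/* for callers that want to check the results */
#include <stdbool.h>
#include <stdint.h>

/* Sizes of the three page levels the loop tries (4K, 2M, 1G). */
#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

/* Hypothetical counter standing in for the number of ACCEPT_PAGE calls. */
static unsigned long sketch_accept_calls;

/*
 * Stand-in for try_accept_one(): returns the accepted size, or 0 if the
 * range cannot be accepted at this size. The kernel code would issue a
 * TDX module call at this point; the sketch only counts the attempt.
 */
static uint64_t sketch_try_accept_one(uint64_t start, uint64_t len,
				      uint64_t size)
{
	if (start & (size - 1))		/* start not aligned to this size */
		return 0;
	if (len < size)			/* remaining range is too short */
		return 0;

	sketch_accept_calls++;
	return size;
}

/* Models the while loop: try 1G first, then 2M, then 4K. */
static bool sketch_accept_range(uint64_t start, uint64_t end)
{
	while (start < end) {
		uint64_t len = end - start;
		uint64_t accept_size;

		accept_size = sketch_try_accept_one(start, len, SZ_1G);
		if (!accept_size)
			accept_size = sketch_try_accept_one(start, len, SZ_2M);
		if (!accept_size)
			accept_size = sketch_try_accept_one(start, len, SZ_4K);
		if (!accept_size)
			return false;	/* range is not 4K-granular */
		start += accept_size;
	}

	return true;
}
```

For a range that starts 4K below the 1G boundary and ends 2M above the 2G
boundary, the model takes three accepts (one 4K, one 1G, one 2M), where a
4K-only loop would need 262,657. A range shorter than 4K makes
sketch_accept_range() return false, which corresponds to the case where
tdx_accept_memory() panics.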