From patchwork Sun May 7 23:46:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
X-Patchwork-Id: 13233957
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson,
	Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes,
	Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
	Paolo Bonzini, Ingo Molnar, Dario Faggioli, Mike Rapoport,
	David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com,
	tim.gardner@canonical.com, khalid.elmously@canonical.com,
	philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com,
	x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov", Dave Hansen
Subject: [PATCHv10 08/11] x86/mm: Avoid load_unaligned_zeropad() stepping into unaccepted memory
Date: Mon, 8 May 2023 02:46:15 +0300
Message-Id: <20230507234618.18067-9-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.3
In-Reply-To: <20230507234618.18067-1-kirill.shutemov@linux.intel.com>
References: <20230507234618.18067-1-kirill.shutemov@linux.intel.com>

load_unaligned_zeropad() can lead to unwanted loads across page
boundaries. The unwanted loads are typically harmless. But, they might
be made to totally unrelated or even unmapped memory.
load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
#VE) to recover from these unwanted loads.
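
To make the hazard concrete, here is a minimal user-space model of such
a cross-boundary load (illustrative only, not kernel code; the two-page
mapping and the 7-byte string are assumptions made for the example):

	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		long page = 4096;
		char *buf = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		char *s;
		uint64_t v;

		if (buf == MAP_FAILED)
			return 1;

		/*
		 * Place a 7-byte string ("abcdef" plus NUL) flush against
		 * the boundary between the two pages.
		 */
		s = buf + page - 7;
		memcpy(s, "abcdef", 7);

		/*
		 * An 8-byte word-at-a-time load starting at 's' reads one
		 * byte from the second page -- memory the string itself
		 * never touches. That overhanging byte is what the
		 * exception fixup normally zero-pads away.
		 */
		memcpy(&v, s, 8);
		printf("loaded: 0x%llx\n", (unsigned long long)v);
		return 0;
	}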
But, this approach does not work for unaccepted memory. For TDX, a load
from unaccepted memory will not lead to a recoverable exception within
the guest. The guest will exit to the VMM where the only recourse is to
terminate the guest.

There are two parts to fix this issue and comprehensively avoid access
to unaccepted memory. Together these ensure that an extra "guard" page
is accepted in addition to the memory that needs to be used (the
extension rule the two parts share is sketched below):

1. Implicitly extend the range_contains_unaccepted_memory(start, end)
   checks up to end+2M if 'end' is aligned on a 2M boundary. It may
   require checking a 2M chunk beyond the end of RAM. The bitmap
   allocation is modified to accommodate this.

2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
   aligned on a 2M boundary.

Side note: This leads to something strange. Pages which were accepted
at boot, marked by the firmware as accepted and will never _need_ to be
accepted, might be on the unaccepted_pages list. This is a cue to ensure
that the next page is accepted before 'page' can be used.

This is an actual, real-world problem which was discovered during TDX
testing.
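
Both parts boil down to the same small extension rule. A user-space
sketch of it (guard_extend() and the example addresses are hypothetical;
the kernel open-codes the same arithmetic in the two helpers, as the
diff below shows):

	#include <stdio.h>

	#define PMD_SIZE	(2UL << 20)	/* 2M chunks, as on x86-64 */

	/* Cover one extra 2M chunk when 'end' is 2M-aligned. */
	static unsigned long guard_extend(unsigned long end)
	{
		if (!(end % PMD_SIZE))
			end += PMD_SIZE;
		return end;
	}

	int main(void)
	{
		/* 2M-aligned end: the next chunk joins the range. */
		printf("%#lx\n", guard_extend(0x400000));	/* 0x600000 */

		/*
		 * Unaligned end: the chunk containing 'end' is already
		 * checked/accepted as a whole (DIV_ROUND_UP rounds up),
		 * so the guard page is covered without extending.
		 */
		printf("%#lx\n", guard_extend(0x400001));	/* 0x400001 */
		return 0;
	}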
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Dave Hansen
---
 arch/x86/mm/unaccepted_memory.c         | 33 +++++++++++++++++++++++++
 drivers/firmware/efi/libstub/x86-stub.c |  7 ++++++
 2 files changed, 40 insertions(+)

diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
index 1df918b21469..2f38059e5b08 100644
--- a/arch/x86/mm/unaccepted_memory.c
+++ b/arch/x86/mm/unaccepted_memory.c
@@ -23,6 +23,32 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
 	bitmap = __va(boot_params.unaccepted_memory);
 	range_start = start / PMD_SIZE;
 
+	/*
+	 * load_unaligned_zeropad() can lead to unwanted loads across page
+	 * boundaries. The unwanted loads are typically harmless. But, they
+	 * might be made to totally unrelated or even unmapped memory.
+	 * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
+	 * #VE) to recover from these unwanted loads.
+	 *
+	 * But, this approach does not work for unaccepted memory. For TDX, a
+	 * load from unaccepted memory will not lead to a recoverable exception
+	 * within the guest. The guest will exit to the VMM where the only
+	 * recourse is to terminate the guest.
+	 *
+	 * There are two parts to fix this issue and comprehensively avoid
+	 * access to unaccepted memory. Together these ensure that an extra
+	 * "guard" page is accepted in addition to the memory that needs to be
+	 * used:
+	 *
+	 * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
+	 *    checks up to end+2M if 'end' is aligned on a 2M boundary.
+	 *
+	 * 2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
+	 *    aligned on a 2M boundary. (immediately following this comment)
+	 */
+	if (!(end % PMD_SIZE))
+		end += PMD_SIZE;
+
 	spin_lock_irqsave(&unaccepted_memory_lock, flags);
 	for_each_set_bitrange_from(range_start, range_end, bitmap,
 				   DIV_ROUND_UP(end, PMD_SIZE)) {
@@ -46,6 +72,13 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
 
 	bitmap = __va(boot_params.unaccepted_memory);
 
+	/*
+	 * Also consider the unaccepted state of the *next* page. See fix #1 in
+	 * the comment on load_unaligned_zeropad() in accept_memory().
+	 */
+	if (!(end % PMD_SIZE))
+		end += PMD_SIZE;
+
 	spin_lock_irqsave(&unaccepted_memory_lock, flags);
 	while (start < end) {
 		if (test_bit(start / PMD_SIZE, bitmap)) {
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 1643ddbde249..1afe7b5b02e1 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -715,6 +715,13 @@ static efi_status_t allocate_unaccepted_bitmap(struct boot_params *params,
 		return EFI_SUCCESS;
 	}
 
+	/*
+	 * range_contains_unaccepted_memory() may need to check one 2M chunk
+	 * beyond the end of RAM to deal with load_unaligned_zeropad(). Make
+	 * sure that the bitmap is large enough to handle it.
+	 */
+	max_addr += PMD_SIZE;
+
 	/*
 	 * If unaccepted memory is present, allocate a bitmap to track what
 	 * memory has to be accepted before access.
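
As a closing note on the bitmap sizing, the effect of the extra PMD_SIZE
can be worked through numerically (a sketch; the 4G figure is arbitrary,
and the stub derives the real max_addr from the EFI memory map):

	#include <stdio.h>

	#define PMD_SIZE		(2UL << 20)
	#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

	int main(void)
	{
		unsigned long max_addr = 4UL << 30;	/* e.g. 4G of RAM */
		unsigned long bits, bytes;

		/* One extra 2M chunk for the load_unaligned_zeropad() guard. */
		max_addr += PMD_SIZE;

		bits  = DIV_ROUND_UP(max_addr, PMD_SIZE);
		bytes = DIV_ROUND_UP(bits, 8);

		/* 4G/2M = 2048 chunks, plus one guard chunk -> 2049 bits. */
		printf("%lu bits, %lu bytes\n", bits, bytes);
		return 0;
	}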