From patchwork Tue Jan 16 02:20:05 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13520330
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 kirill.shutemov@linux.intel.com, haiyangz@microsoft.com,
 wei.liu@kernel.org, decui@microsoft.com, luto@kernel.org,
 peterz@infradead.org, akpm@linux-foundation.org, urezki@gmail.com,
 hch@infradead.org, lstoakes@gmail.com, thomas.lendacky@amd.com,
 ardb@kernel.org, jroedel@suse.de, seanjc@google.com,
 rick.p.edgecombe@intel.com,
 sathyanarayanan.kuppuswamy@linux.intel.com,
 linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
 linux-hyperv@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v4 0/3] x86/hyperv: Mark CoCo VM pages not present when
 changing encrypted state
Date: Mon, 15 Jan 2024 18:20:05 -0800
Message-Id: <20240116022008.1023398-1-mhklinux@outlook.com>
Reply-To: mhklinux@outlook.com

From: Michael Kelley

In a CoCo VM, when transitioning memory from encrypted to decrypted, or
vice versa, the caller of set_memory_encrypted() or
set_memory_decrypted() is responsible for ensuring the memory isn't in
use and isn't referenced while the transition is in progress. The
transition has multiple steps, and the memory is in an inconsistent
state until all steps are complete. A reference while the state is
inconsistent could result in an exception that can't be cleanly fixed
up.

However, the kernel load_unaligned_zeropad() mechanism could cause a
stray reference that can't be prevented by the caller of
set_memory_encrypted() or set_memory_decrypted(), so there's specific
code to handle this case. But a CoCo VM running on Hyper-V may be
configured to run with a paravisor, with the #VC or #VE exception
routed to the paravisor. There's no architectural way to forward the
exceptions back to the guest kernel, and in such a case, the
load_unaligned_zeropad() specific code doesn't work.

To avoid this problem, mark pages as "not present" while a transition
is in progress. If load_unaligned_zeropad() causes a stray reference, a
normal page fault is generated instead of #VC or #VE, and the
page-fault-based fixup handlers for load_unaligned_zeropad() resolve
the reference. When the encrypted/decrypted transition is complete,
mark the pages as "present" again.

This version of the patch series marks transitioning pages "not
present" only when running as a Hyper-V guest with a paravisor.
Previous versions[1] marked transitioning pages "not present"
regardless of the hypervisor and regardless of whether a paravisor is
in use. That more general use had the benefit of decoupling the
load_unaligned_zeropad() fixup from CoCo VM #VE and #VC exception
handling.
But the implementation was problematic for SEV-SNP because the SEV-SNP
hypervisor callbacks require a valid virtual address, not a physical
address like with TDX and the Hyper-V paravisor. Marking the
transitioning pages "not present" causes the virtual address to not be
valid, and the PVALIDATE instruction in the SEV-SNP callback fails.
Constructing a temporary virtual address for this purpose is slower and
adds complexity that negates the benefits of the more general use. So
this version narrows the applicability of the approach to just where it
is required because of the #VC and #VE exceptions being routed to a
paravisor.

The previous version minimized the TLB flushing done during page
transitions between encrypted and decrypted. Because this version marks
the pages "not present" in hypervisor specific callbacks and not in
__set_memory_enc_pgtable(), doing such optimization is more difficult
to coordinate. But the page transitions are not a hot path, so this
version eschews optimization of TLB flushing in favor of simplicity.

Since this version no longer touches __set_memory_enc_pgtable(), I've
also removed patches that add comments about error handling in that
function. Rick Edgecombe has proposed patches to improve that error
handling, and I'll leave those comments to Rick's patches.

Patch 1 handles implications of the hypervisor callbacks needing to do
virt-to-phys translations on pages that are temporarily marked not
present.

Patch 2 makes the existing set_memory_p() function available for use in
the hypervisor callbacks.

Patch 3 is the core change that marks the transitioning pages as not
present.

This patch set is based on the linux-next20240103 code tree.

Changes in v4:
* Patch 1: Updated comment in slow_virt_to_phys() to reduce the
  likelihood of the comment becoming stale. The new comment describes
  the requirement to work with leaf PTE not present, but doesn't
  directly reference the CoCo hypervisor callbacks. [Rick Edgecombe]
* Patch 1: Decomposed a complex line-wrapped statement into multiple
  statements for ease of understanding. No functional change compared
  with v3. [Kirill Shutemov]
* Patch 3: Fixed handling of memory allocation errors. [Rick Edgecombe]

Changes in v3:
* Major rework and simplification per discussion above.

Changes in v2:
* Added Patches 3 and 4 to deal with the failure on SEV-SNP
  [Tom Lendacky]
* Split the main change into two separate patches (Patch 5 and Patch 6)
  to improve reviewability and to offer the option of retaining both
  hypervisor callbacks.
* Patch 5 moves set_memory_p() out of an #ifdef CONFIG_X86_64 so that
  the code builds correctly for 32-bit, even though it is never
  executed for 32-bit [reported by kernel test robot]

[1] https://lore.kernel.org/lkml/20231121212016.1154303-1-mhklinux@outlook.com/

Michael Kelley (3):
  x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor
    callback
  x86/mm: Regularize set_memory_p() parameters and make non-static
  x86/hyperv: Make encrypted/decrypted changes safe for
    load_unaligned_zeropad()

 arch/x86/hyperv/ivm.c             | 65 ++++++++++++++++++++++++++++---
 arch/x86/include/asm/set_memory.h |  1 +
 arch/x86/mm/pat/set_memory.c      | 24 +++++++-----
 3 files changed, 75 insertions(+), 15 deletions(-)
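
For readers who want the idea of the cover letter in miniature, the
following is a rough userspace model, not kernel code and not taken from
the patches. It assumes a simplified two-step transition (guest page
tables first, then hypervisor/paravisor state) and shows why a stray
load during the inconsistent window would raise #VC/#VE unless the page
is first marked not present. All names here (struct page_model,
stray_load(), transition(), the fault_kind values) are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of a stray load, ordered from best to worst. */
enum fault_kind {
	FAULT_NONE,       /* the access simply succeeds */
	FAULT_PAGE_FAULT, /* #PF: the page-fault-based fixup can resolve it */
	FAULT_COCO_EXC,   /* #VC/#VE: routed to the paravisor, no guest fixup */
};

/* Hypothetical, heavily simplified view of one page in a CoCo VM. */
struct page_model {
	bool present;       /* present bit in the guest PTE */
	bool guest_private; /* encrypted state according to the guest PTE */
	bool host_private;  /* encrypted state according to the hypervisor */
};

/* What a stray load (e.g. from load_unaligned_zeropad()) would hit. */
static enum fault_kind stray_load(const struct page_model *pg)
{
	if (!pg->present)
		return FAULT_PAGE_FAULT;
	if (pg->guest_private != pg->host_private)
		return FAULT_COCO_EXC;
	return FAULT_NONE;
}

static enum fault_kind worse(enum fault_kind a, enum fault_kind b)
{
	return a > b ? a : b;
}

/*
 * Run a transition and return the worst fault a stray load could have
 * taken at any intermediate step. The step order is an assumption made
 * for this model only.
 */
static enum fault_kind transition(struct page_model *pg, bool to_private,
				  bool mark_not_present)
{
	enum fault_kind worst = FAULT_NONE;

	/* The model only handles pages that start out consistent. */
	assert(pg->present && pg->guest_private == pg->host_private);

	if (mark_not_present)
		pg->present = false;

	pg->guest_private = to_private;  /* guest page table update */
	worst = worse(worst, stray_load(pg));

	pg->host_private = to_private;   /* hypervisor/paravisor update */
	worst = worse(worst, stray_load(pg));

	pg->present = true;              /* transition complete */
	return worst;
}
```

In this model, calling transition() on a consistent, encrypted page with
mark_not_present set to false reports FAULT_COCO_EXC as the worst case,
while setting it to true limits the worst case to FAULT_PAGE_FAULT,
which is the class of fault the existing load_unaligned_zeropad() fixup
is described as handling.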