From patchwork Wed May 8 13:25:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Julian Stecklina
X-Patchwork-Id: 13658755
From: Julian Stecklina
To: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Cc: Thomas Prescher, Julian Stecklina, kvm@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] KVM: x86: add KVM_RUN_X86_GUEST_MODE kvm_run flag
Date: Wed, 8 May 2024 15:25:01 +0200
Message-ID: <20240508132502.184428-1-julian.stecklina@cyberus-technology.de>
X-Mailer: git-send-email 2.44.0
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

From: Thomas Prescher

When a vCPU is interrupted by a signal while running a nested guest,
KVM will exit to userspace with L2 state. However, userspace has no
way to know whether it sees L1 or L2 state (besides calling
KVM_GET_STATS_FD, which does not have a stable ABI).

This causes multiple problems:

The simplest one is L2 state corruption when userspace marks the sregs
as dirty. See this mailing list thread [1] for a complete discussion.

Another problem is that if userspace decides to continue by emulating
instructions, it will unknowingly emulate with L2 state as if L1
doesn't exist, which can be considered a weird guest escape.

This patch introduces a new flag KVM_RUN_X86_GUEST_MODE in the kvm_run
data structure, which is set when the vCPU exited while running a
nested guest. Userspace can then handle this situation.

To see whether this functionality is available, this patch also
introduces a new capability KVM_CAP_X86_GUEST_MODE.

[1] https://lore.kernel.org/kvm/20240416123558.212040-1-julian.stecklina@cyberus-technology.de/T/#m280aadcb2e10ae02c191a7dc4ed4b711a74b1f55

Signed-off-by: Thomas Prescher
Signed-off-by: Julian Stecklina
---
 Documentation/virt/kvm/api.rst  | 17 +++++++++++++++++
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/x86.c              |  3 +++
 include/uapi/linux/kvm.h        |  1 +
 4 files changed, 22 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 0b5a33ee71ee..7748c3eb98e0 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6419,6 +6419,9 @@ affect the device's behavior. Current defined flags::
   #define KVM_RUN_X86_SMM (1 << 0)
   /* x86, set if bus lock detected in VM */
   #define KVM_RUN_BUS_LOCK (1 << 1)
+  /* x86, set if the VCPU exited from a nested (L2) guest */
+  #define KVM_RUN_X86_GUEST_MODE (1 << 2)
+
   /* arm64, set for KVM_EXIT_DEBUG */
   #define KVM_DEBUG_ARCH_HSR_HIGH_VALID (1 << 0)
 
@@ -8063,6 +8066,20 @@ error/annotated fault.
 
 See KVM_EXIT_MEMORY_FAULT for more information.
 
+7.34 KVM_CAP_X86_GUEST_MODE
+------------------------------
+
+:Architectures: x86
+:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
+
+The presence of this capability indicates that KVM_RUN will update the
+KVM_RUN_X86_GUEST_MODE bit in kvm_run.flags to indicate whether the
+vCPU was executing nested guest code when it exited.
+
+KVM exits with the register state of either the L1 or L2 guest
+depending on which executed at the time of an exit. Userspace must
+take care to differentiate between these cases.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index ef11aa4cab42..ff4ed82a2d06 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -106,6 +106,7 @@ struct kvm_ioapic_state {
 
 #define KVM_RUN_X86_SMM (1 << 0)
 #define KVM_RUN_X86_BUS_LOCK (1 << 1)
+#define KVM_RUN_X86_GUEST_MODE (1 << 2)
 
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91478b769af0..64f2cba9345e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4714,6 +4714,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_MEMORY_FAULT_INFO:
+	case KVM_CAP_X86_GUEST_MODE:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
@@ -10200,6 +10201,8 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
 
 	if (is_smm(vcpu))
 		kvm_run->flags |= KVM_RUN_X86_SMM;
+	if (is_guest_mode(vcpu))
+		kvm_run->flags |= KVM_RUN_X86_GUEST_MODE;
 }
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..ccb12f6a656d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -917,6 +917,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_MEMORY_ATTRIBUTES 233
 #define KVM_CAP_GUEST_MEMFD 234
 #define KVM_CAP_VM_TYPES 235
+#define KVM_CAP_X86_GUEST_MODE 236
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
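
Illustration only, not part of the patch: a minimal sketch of how a
VMM might consume the new flag, assuming it has already queried
KVM_CAP_X86_GUEST_MODE through KVM_CHECK_EXTENSION and has read
kvm_run.flags after KVM_RUN returned. The enum exit_policy, its
members, and classify_exit() are hypothetical names invented for this
example. The policy of re-entering the guest without emulating or
dirtying registers is one possible reaction, not something the patch
mandates. The flag values are copied from the patch so the snippet
builds without updated headers; real code would take them from
<linux/kvm.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in this patch; normally provided by <linux/kvm.h>. */
#ifndef KVM_RUN_X86_SMM
#define KVM_RUN_X86_SMM        (1 << 0)
#endif
#ifndef KVM_RUN_X86_GUEST_MODE
#define KVM_RUN_X86_GUEST_MODE (1 << 2)
#endif

static_assert((KVM_RUN_X86_GUEST_MODE & KVM_RUN_X86_SMM) == 0,
	      "kvm_run flag bits must not overlap");

/* Hypothetical VMM-side policy; none of these names exist in the KVM ABI. */
enum exit_policy {
	EXIT_HANDLE_AS_L1,      /* register state belongs to L1 */
	EXIT_REENTER_UNTOUCHED, /* L2 state: do not emulate or mark sregs dirty */
	EXIT_ASSUME_L1_LEGACY,  /* capability absent: the bit carries no information */
};

/*
 * have_guest_mode_cap: result of a KVM_CHECK_EXTENSION query for
 *                      KVM_CAP_X86_GUEST_MODE (assumed to be done by the caller).
 * run_flags:           kvm_run.flags as read after KVM_RUN returned.
 */
static enum exit_policy classify_exit(bool have_guest_mode_cap, uint16_t run_flags)
{
	if (!have_guest_mode_cap)
		return EXIT_ASSUME_L1_LEGACY;
	if (run_flags & KVM_RUN_X86_GUEST_MODE)
		return EXIT_REENTER_UNTOUCHED;
	return EXIT_HANDLE_AS_L1;
}
```

A caller would evaluate classify_exit() once per return from KVM_RUN,
before deciding whether to emulate instructions or write registers
back to the vCPU.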