From patchwork Mon Nov 21 00:26:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Kai" X-Patchwork-Id: 13050205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 990A0C4332F for ; Mon, 21 Nov 2022 00:27:15 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 230976B007B; Sun, 20 Nov 2022 19:27:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1E13F6B007D; Sun, 20 Nov 2022 19:27:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0826E8E0001; Sun, 20 Nov 2022 19:27:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id EC8646B007B for ; Sun, 20 Nov 2022 19:27:14 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id B0680140690 for ; Mon, 21 Nov 2022 00:27:14 +0000 (UTC) X-FDA: 80155559988.12.5E0B384 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf22.hostedemail.com (Postfix) with ESMTP id 02244C000B for ; Mon, 21 Nov 2022 00:27:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668990433; x=1700526433; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3s5S0WEvIVlYJHz6MygetQEQmwV83ab1apvR2JvNNVM=; b=lqYLExaeKwmibIYaU0SOn1obcg06hEkRkXF1expDCwL3Pxg/poyEuGuC 6BSsWAYtSKEhA6epL5x5DD/pOFL5qXODVo8KnEPQkLz293AquYkF6LoKP Gxe3LXD/uAu6kHp4m7qnz7/VZTgjMlD9xmGQpwYbSHWqTsUkU+wf54ijF 5/YSc1+fEabukFBMw9kLC6mcBXfuT9Nlu7eG3xtm0nxTRFM/D5zuf1Qaa UsK9+gdx6C7YRz4H+FGgLJe2XHUNZ4BDXqiB/TUhBelb98YhNfC2dSKp4 oXdYcqtoaWcmub/YFAlVZIs05Tgd1PK+P/8AMsaKvdL8iSMDpPsFNm4Xq w==; X-IronPort-AV: E=McAfee;i="6500,9779,10537"; a="296803678" X-IronPort-AV: E=Sophos;i="5.96,180,1665471600"; d="scan'208";a="296803678" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Nov 2022 16:27:12 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10537"; a="729825228" X-IronPort-AV: E=Sophos;i="5.96,180,1665471600"; d="scan'208";a="729825228" Received: from tomnavar-mobl.amr.corp.intel.com (HELO khuang2-desk.gar.corp.intel.com) ([10.209.176.15]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Nov 2022 16:27:08 -0800 From: Kai Huang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: linux-mm@kvack.org, seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, dan.j.williams@intel.com, rafael.j.wysocki@intel.com, kirill.shutemov@linux.intel.com, ying.huang@intel.com, reinette.chatre@intel.com, len.brown@intel.com, tony.luck@intel.com, peterz@infradead.org, ak@linux.intel.com, isaku.yamahata@intel.com, chao.gao@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com, kai.huang@intel.com Subject: [PATCH v7 04/20] x86/virt/tdx: Add skeleton to initialize TDX on demand Date: Mon, 21 Nov 2022 13:26:26 +1300 Message-Id: X-Mailer: git-send-email 2.38.1 In-Reply-To: References: MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1668990433; a=rsa-sha256; cv=none; b=gJ3dXnawOw+hKRqG944l+T7p1GkCJCaC0TOI3waihT8ldssk5cyQdHhGVUH4p75OpmVuKu NtFHhTNlTGrOnZgxwIJOPGzTNrKIgL1bRmsRTiJnJISn7O06dBnEqGtq0UbxsTgSZV/LBa ic7EHkROYiC8S+s1k6kwqRXwYW+lKn0= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=lqYLExae; spf=pass (imf22.hostedemail.com: domain of kai.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=kai.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1668990433; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=7ur5RcuQkWcjjwGZvulljBWYNnnq4yuWImrMgWC3aF4=; b=kwNvu42+XJf4bQSxC6mpuIDbU2MjGqMI7NSnO7zkgf3lxFsHux2z33L+txhOUt1uVBKtUp ODkRBhfYbRMJhvo9ZPJh6ELNpz5YRKj/xYUOJSis8SrLngYAz1pMlUPZk5tsmaba0Jox2c hzWpEXc/iRkgLHiPnQIRw6tFbASpde0= X-Rspam-User: X-Stat-Signature: ii41ddfmxesyyu8ut3ppu3snruyou1qa X-Rspamd-Queue-Id: 02244C000B Authentication-Results: imf22.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=lqYLExae; spf=pass (imf22.hostedemail.com: domain of kai.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=kai.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspamd-Server: rspam07 X-HE-Tag: 1668990432-33100 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Before the TDX module can be used to create and run TDX guests, it must be loaded and properly initialized. The TDX module is expected to be loaded by the BIOS, and to be initialized by the kernel. TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM). The host kernel communicates with the TDX module via a new SEAMCALL instruction. The TDX module implements a set of SEAMCALL leaf functions to allow the host kernel to initialize it. The TDX module can be initialized only once in its lifetime. Instead of always initializing it at boot time, this implementation chooses an "on demand" approach to initialize TDX until there is a real need (e.g when requested by KVM). This approach has below pros: 1) It avoids consuming the memory that must be allocated by kernel and given to the TDX module as metadata (~1/256th of the TDX-usable memory), and also saves the CPU cycles of initializing the TDX module (and the metadata) when TDX is not used at all. 2) It is more flexible to support TDX module runtime updating in the future (after updating the TDX module, it needs to be initialized again). 3) It avoids having to do a "temporary" solution to handle VMXON in the core (non-KVM) kernel for now. This is because SEAMCALL requires CPU being in VMX operation (VMXON is done), but currently only KVM handles VMXON. Adding VMXON support to the core kernel isn't trivial. More importantly, from long-term a reference-based approach is likely needed in the core kernel as more kernel components are likely needed to support TDX as well. Allow KVM to initialize the TDX module avoids having to handle VMXON during kernel boot for now. Add a placeholder tdx_enable() to detect and initialize the TDX module on demand, with a state machine protected by mutex to support concurrent calls from multiple callers. The TDX module will be initialized in multi-steps defined by the TDX module: 1) Global initialization; 2) Logical-CPU scope initialization; 3) Enumerate the TDX module capabilities and platform configuration; 4) Configure the TDX module about TDX usable memory ranges and global KeyID information; 5) Package-scope configuration for the global KeyID; 6) Initialize usable memory ranges based on 4). The TDX module can also be shut down at any time during its lifetime. In case of any error during the initialization process, shut down the module. It's pointless to leave the module in any intermediate state during the initialization. Both logical CPU scope initialization and shutting down the TDX module require calling SEAMCALL on all boot-time present CPUs. For simplicity just temporarily disable CPU hotplug during the module initialization. Note TDX architecturally doesn't support physical CPU hot-add/removal. A non-buggy BIOS should never support ACPI CPU hot-add/removal. This implementation doesn't explicitly handle ACPI CPU hot-add/removal but depends on the BIOS to do the right thing. Reviewed-by: Chao Gao Signed-off-by: Kai Huang --- v6 -> v7: - No change. v5 -> v6: - Added code to set status to TDX_MODULE_NONE if TDX module is not loaded (Chao) - Added Chao's Reviewed-by. - Improved comments around cpus_read_lock(). - v3->v5 (no feedback on v4): - Removed the check that SEAMRR and TDX KeyID have been detected on all present cpus. - Removed tdx_detect(). - Added num_online_cpus() to MADT-enabled CPUs check within the CPU hotplug lock and return early with error message. - Improved dmesg printing for TDX module detection and initialization. --- arch/x86/include/asm/tdx.h | 2 + arch/x86/virt/vmx/tdx/tdx.c | 150 ++++++++++++++++++++++++++++++++++++ 2 files changed, 152 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 51c4222a13ae..05fc89d9742a 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -101,8 +101,10 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, #ifdef CONFIG_INTEL_TDX_HOST bool platform_tdx_enabled(void); +int tdx_enable(void); #else /* !CONFIG_INTEL_TDX_HOST */ static inline bool platform_tdx_enabled(void) { return false; } +static inline int tdx_enable(void) { return -ENODEV; } #endif /* CONFIG_INTEL_TDX_HOST */ #endif /* !__ASSEMBLY__ */ diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 8d943bdc8335..28c187b8726f 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -10,15 +10,34 @@ #include #include #include +#include +#include +#include #include #include #include #include #include "tdx.h" +/* TDX module status during initialization */ +enum tdx_module_status_t { + /* TDX module hasn't been detected and initialized */ + TDX_MODULE_UNKNOWN, + /* TDX module is not loaded */ + TDX_MODULE_NONE, + /* TDX module is initialized */ + TDX_MODULE_INITIALIZED, + /* TDX module is shut down due to initialization error */ + TDX_MODULE_SHUTDOWN, +}; + static u32 tdx_keyid_start __ro_after_init; static u32 tdx_keyid_num __ro_after_init; +static enum tdx_module_status_t tdx_module_status; +/* Prevent concurrent attempts on TDX detection and initialization */ +static DEFINE_MUTEX(tdx_module_lock); + /* * Detect TDX private KeyIDs to see whether TDX has been enabled by the * BIOS. Both initializing the TDX module and running TDX guest require @@ -104,3 +123,134 @@ bool platform_tdx_enabled(void) { return !!tdx_keyid_num; } + +/* + * Detect and initialize the TDX module. + * + * Return -ENODEV when the TDX module is not loaded, 0 when it + * is successfully initialized, or other error when it fails to + * initialize. + */ +static int init_tdx_module(void) +{ + /* The TDX module hasn't been detected */ + return -ENODEV; +} + +static void shutdown_tdx_module(void) +{ + /* TODO: Shut down the TDX module */ +} + +static int __tdx_enable(void) +{ + int ret; + + /* + * Initializing the TDX module requires doing SEAMCALL on all + * boot-time present CPUs. For simplicity temporarily disable + * CPU hotplug to prevent any CPU from going offline during + * the initialization. + */ + cpus_read_lock(); + + /* + * Check whether all boot-time present CPUs are online and + * return early with a message so the user can be aware. + * + * Note a non-buggy BIOS should never support physical (ACPI) + * CPU hotplug when TDX is enabled, and all boot-time present + * CPU should be enabled in MADT, so there should be no + * disabled_cpus and num_processors won't change at runtime + * either. + */ + if (disabled_cpus || num_online_cpus() != num_processors) { + pr_err("Unable to initialize the TDX module when there's offline CPU(s).\n"); + ret = -EINVAL; + goto out; + } + + ret = init_tdx_module(); + if (ret == -ENODEV) { + pr_info("TDX module is not loaded.\n"); + tdx_module_status = TDX_MODULE_NONE; + goto out; + } + + /* + * Shut down the TDX module in case of any error during the + * initialization process. It's meaningless to leave the TDX + * module in any middle state of the initialization process. + * + * Shutting down the module also requires doing SEAMCALL on all + * MADT-enabled CPUs. Do it while CPU hotplug is disabled. + * + * Return all errors during the initialization as -EFAULT as the + * module is always shut down. + */ + if (ret) { + pr_info("Failed to initialize TDX module. Shut it down.\n"); + shutdown_tdx_module(); + tdx_module_status = TDX_MODULE_SHUTDOWN; + ret = -EFAULT; + goto out; + } + + pr_info("TDX module initialized.\n"); + tdx_module_status = TDX_MODULE_INITIALIZED; +out: + cpus_read_unlock(); + + return ret; +} + +/** + * tdx_enable - Enable TDX by initializing the TDX module + * + * Caller to make sure all CPUs are online and in VMX operation before + * calling this function. CPU hotplug is temporarily disabled internally + * to prevent any cpu from going offline. + * + * This function can be called in parallel by multiple callers. + * + * Return: + * + * * 0: The TDX module has been successfully initialized. + * * -ENODEV: The TDX module is not loaded, or TDX is not supported. + * * -EINVAL: The TDX module cannot be initialized due to certain + * conditions are not met (i.e. when not all MADT-enabled + * CPUs are not online). + * * -EFAULT: Other internal fatal errors, or the TDX module is in + * shutdown mode due to it failed to initialize in previous + * attempts. + */ +int tdx_enable(void) +{ + int ret; + + if (!platform_tdx_enabled()) + return -ENODEV; + + mutex_lock(&tdx_module_lock); + + switch (tdx_module_status) { + case TDX_MODULE_UNKNOWN: + ret = __tdx_enable(); + break; + case TDX_MODULE_NONE: + ret = -ENODEV; + break; + case TDX_MODULE_INITIALIZED: + ret = 0; + break; + default: + WARN_ON_ONCE(tdx_module_status != TDX_MODULE_SHUTDOWN); + ret = -EFAULT; + break; + } + + mutex_unlock(&tdx_module_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(tdx_enable);