diff mbox series

[v9,04/18] x86/virt/tdx: Add skeleton to initialize TDX on demand

Message ID eb565156d84e35a846a886025513a712f2ac2f83.1676286526.git.kai.huang@intel.com (mailing list archive)
State New, archived
Series TDX host kernel support

Commit Message

Huang, Kai Feb. 13, 2023, 11:59 a.m. UTC
Before the TDX module can be used to create and run TDX guests, it must
be loaded and properly initialized.  The TDX module is expected to be
loaded by the BIOS, and to be initialized by the kernel.

TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM).  The host
kernel communicates with the TDX module via a new SEAMCALL instruction.
The TDX module implements a set of SEAMCALL leaf functions to allow the
host kernel to initialize it.

The TDX module can be initialized only once in its lifetime.  Instead
of always initializing it at boot time, this implementation chooses an
"on demand" approach that defers initializing TDX until there is a real
need (e.g. when requested by KVM).  This approach has the following
advantages:

1) It avoids consuming the memory that must be allocated by kernel and
given to the TDX module as metadata (~1/256th of the TDX-usable memory),
and also saves the CPU cycles of initializing the TDX module (and the
metadata) when TDX is not used at all.

2) The TDX module design allows it to be updated while the system is
running.  The update procedure shares quite a few steps with this "on
demand" initialization mechanism.  The hope is that much of the "on
demand" mechanism can be shared with a future "update" mechanism.  A
boot-time TDX module implementation would not be able to share much code
with the update mechanism.

3) Loading the TDX module requires VMX to be enabled.  Currently, only
the kernel KVM code mucks with VMX enabling.  If the TDX module were to
be initialized separately from KVM (like at boot), the boot code would
need to be taught how to muck with VMX enabling and KVM would need to be
taught how to cope with that.  Making KVM itself responsible for TDX
initialization lets the rest of the kernel stay blissfully unaware of
VMX.
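
To put the "~1/256th" figure from item 1 in perspective, here is a rough
back-of-the-envelope sketch in C.  The helper name tdx_metadata_estimate()
is hypothetical and not part of this patch, and the real metadata size
depends on the TDX module, so treat the result as an approximation only.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * Rough estimate of the memory handed to the TDX module as metadata,
 * using the "~1/256th of the TDX-usable memory" figure from the
 * changelog.  The exact amount is determined by the TDX module.
 */
static inline uint64_t tdx_metadata_estimate(uint64_t tdx_usable_bytes)
{
	return tdx_usable_bytes / 256;
}
```

Under this approximation, a host with 1 TiB of TDX-usable memory would
give up about 4 GiB to metadata if TDX were always initialized at boot.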

Add a placeholder tdx_enable() to initialize the TDX module on demand.
The TODO list will be pared down as functionality is added.

Use a state machine protected by a mutex to make sure the initialization
is only done once, as tdx_enable() can be called multiple times
(e.g. the KVM module can be reloaded) and may be called concurrently by
other kernel components in the future.
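
The once-only behavior described above can be modeled in userspace.  The
following is a minimal sketch, assuming a pthread mutex stands in for the
kernel mutex.  All model_* names are hypothetical; model_init() is only a
stand-in for the real initialization steps, with model_init_result and
model_init_calls added purely so the behavior can be observed.

```c
#include <errno.h>
#include <pthread.h>

/* Mirrors the three states of the patch's enum tdx_module_status_t. */
enum model_status { MODEL_UNKNOWN, MODEL_INITIALIZED, MODEL_ERROR };

static enum model_status model_status = MODEL_UNKNOWN;
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/* Demo-only knobs: what the init step returns and how often it ran. */
static int model_init_result;
static int model_init_calls;

/* Stand-in for the real one-shot initialization work. */
static int model_init(void)
{
	model_init_calls++;
	return model_init_result;
}

/* Safe to call repeatedly and from multiple threads. */
int model_enable(void)
{
	int ret;

	pthread_mutex_lock(&model_lock);

	switch (model_status) {
	case MODEL_UNKNOWN:
		/* First caller does the work and records the outcome. */
		ret = model_init() ? -EINVAL : 0;
		model_status = ret ? MODEL_ERROR : MODEL_INITIALIZED;
		break;
	case MODEL_INITIALIZED:
		/* Already done, report success without redoing it. */
		ret = 0;
		break;
	default:
		/* An earlier attempt failed; do not retry. */
		ret = -EINVAL;
		break;
	}

	pthread_mutex_unlock(&model_lock);

	return ret;
}
```

In this model the init step runs at most once: later calls return the
recorded outcome, which is the property that matters when a caller such
as KVM is unloaded and loaded again.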

Also introduce a local tdx.h to hold all TDX architectural and
kernel-defined structures and declarations used by module initialization.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
---

v8 -> v9:
 - Removed detailed TODO list in the changelog (Dave).
 - Added back steps to do module global initialization and per-cpu
   initialization in the TODO list comment.
 - Moved the 'enum tdx_module_status_t' from tdx.c to local tdx.h

v7 -> v8:
 - Refined changelog (Dave).
 - Removed "all BIOS-enabled cpus" related code (Peter/Thomas/Dave).
 - Add a "TODO list" comment in init_tdx_module() to list all steps of
   initializing the TDX Module to tell the story (Dave).
 - Made tdx_enable() universally return -EINVAL, and removed nonsense
   comments (Dave).
 - Simplified __tdx_enable() to only handle success or failure.
 - TDX_MODULE_SHUTDOWN -> TDX_MODULE_ERROR
 - Removed TDX_MODULE_NONE (not loaded) as it is not necessary.
 - Improved comments (Dave).
 - Pointed out 'tdx_module_status' is software thing (Dave).

v6 -> v7:
 - No change.

v5 -> v6:
 - Added code to set status to TDX_MODULE_NONE if TDX module is not
   loaded (Chao)
 - Added Chao's Reviewed-by.
 - Improved comments around cpus_read_lock().

v3 -> v5 (no feedback on v4):
 - Removed the check that SEAMRR and TDX KeyID have been detected on
   all present cpus.
 - Removed tdx_detect().
 - Added num_online_cpus() to MADT-enabled CPUs check within the CPU
   hotplug lock and return early with error message.
 - Improved dmesg printing for TDX module detection and initialization.

---
 arch/x86/include/asm/tdx.h  |  2 +
 arch/x86/virt/vmx/tdx/tdx.c | 89 +++++++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h | 12 +++++
 3 files changed, 103 insertions(+)
 create mode 100644 arch/x86/virt/vmx/tdx/tdx.h

Comments

Peter Zijlstra Feb. 14, 2023, 12:46 p.m. UTC | #1
On Tue, Feb 14, 2023 at 12:59:11AM +1300, Kai Huang wrote:
> Use a state machine protected by mutex to make sure the initialization
> will only be done once, as tdx_enable() can be called multiple times
> (i.e. KVM module can be reloaded) and be called concurrently by other
> kernel components in the future.

I still object to doing tdx_enable() at kvm module load.

kvm.ko gets loaded unconditionally on boot, even if I then never use
kvm.

This stuff needs to be done when an actual VM is created, not before.

Dave Hansen Feb. 14, 2023, 5:23 p.m. UTC | #2
On 2/14/23 04:46, Peter Zijlstra wrote:
> On Tue, Feb 14, 2023 at 12:59:11AM +1300, Kai Huang wrote:
>> Use a state machine protected by mutex to make sure the initialization
>> will only be done once, as tdx_enable() can be called multiple times
>> (i.e. KVM module can be reloaded) and be called concurrently by other
>> kernel components in the future.
> I still object to doing tdx_enable() at kvm module load.
> 
> kvm.ko gets loaded unconditionally on boot, even if I then never use
> kvm.
> 
> This stuff needs to be done when an actual VM is created, not before.

The actual implementation of this is hidden over in the KVM side of
this.  But, tdx_enable() and all of this jazz should not be called on
kvm.ko load.  It'll happen when KVM tries to start the first TDX VM.

I think what Kai was thinking of was *this* sequence:

 1. insmod kvm.ko
 2. Start a TDX guest, tdx_enable() gets run
 3. rmmod kvm
 4. insmod kvm.ko (again)
 5. Start another TDX guest, run tdx_enable() (again)

The rmmod/insmod pair is what triggers the second call of tdx_enable().

Huang, Kai Feb. 14, 2023, 9:08 p.m. UTC | #3
On Tue, 2023-02-14 at 09:23 -0800, Dave Hansen wrote:
> On 2/14/23 04:46, Peter Zijlstra wrote:
> > On Tue, Feb 14, 2023 at 12:59:11AM +1300, Kai Huang wrote:
> > > Use a state machine protected by mutex to make sure the initialization
> > > will only be done once, as tdx_enable() can be called multiple times
> > > (i.e. KVM module can be reloaded) and be called concurrently by other
> > > kernel components in the future.
> > I still object to doing tdx_enable() at kvm module load.
> > 
> > kvm.ko gets loaded unconditionally on boot, even if I then never use
> > kvm.
> > 
> > This stuff needs to be done when an actual VM is created, not before.
> 
> The actual implementation of this is hidden over in the KVM side of
> this.  But, tdx_enable() and all of this jazz should not be called on
> kvm.ko load.  It'll happen when KVM tries to start the first TDX VM.
> 
> I think what Kai was thinking of was *this* sequence:
> 
>  1. insmod kvm.ko
>  2. Start a TDX guest, tdx_enable() gets run
>  3. rmmod kvm
>  4. insmod kvm.ko (again)
>  5. Start another TDX guest, run tdx_enable() (again)
> 
> The rmmod/insmod pair is what triggers the second call of tdx_enable().

Yes.  The point is tdx_enable() can get called multiple times.

We can discuss further when to enable TDX on the KVM side, and I don't want to
speak for the KVM maintainers, but this is actually not that relevant to this
series.

In the changelog, I just said:

"...initialize TDX until there is a real need (e.g. when requested by KVM)".

I didn't say exactly when KVM will call this.

Patch

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 4dfe2e794411..4a3ee64c1ca7 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -97,8 +97,10 @@  static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
 
 #ifdef CONFIG_INTEL_TDX_HOST
 bool platform_tdx_enabled(void);
+int tdx_enable(void);
 #else	/* !CONFIG_INTEL_TDX_HOST */
 static inline bool platform_tdx_enabled(void) { return false; }
+static inline int tdx_enable(void)  { return -EINVAL; }
 #endif	/* CONFIG_INTEL_TDX_HOST */
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a600b5d0879d..f5a20d56097c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -12,14 +12,20 @@ 
 #include <linux/init.h>
 #include <linux/errno.h>
 #include <linux/printk.h>
+#include <linux/mutex.h>
 #include <asm/msr-index.h>
 #include <asm/msr.h>
 #include <asm/tdx.h>
+#include "tdx.h"
 
 static u32 tdx_global_keyid __ro_after_init;
 static u32 tdx_guest_keyid_start __ro_after_init;
 static u32 tdx_nr_guest_keyids __ro_after_init;
 
+static enum tdx_module_status_t tdx_module_status;
+/* Prevent concurrent attempts on TDX module initialization */
+static DEFINE_MUTEX(tdx_module_lock);
+
 /*
  * Use tdx_global_keyid to indicate that TDX is uninitialized.
  * This is used in TDX initialization error paths to take it from
@@ -103,3 +109,86 @@  bool platform_tdx_enabled(void)
 {
 	return !!tdx_global_keyid;
 }
+
+static int init_tdx_module(void)
+{
+	/*
+	 * TODO:
+	 *
+	 *  - TDX module global initialization.
+	 *  - TDX module per-cpu initialization.
+	 *  - Get TDX module information and TDX-capable memory regions.
+	 *  - Build the list of TDX-usable memory regions.
+	 *  - Construct a list of "TD Memory Regions" (TDMRs) to cover
+	 *    all TDX-usable memory regions.
+	 *  - Configure the TDMRs and the global KeyID to the TDX module.
+	 *  - Configure the global KeyID on all packages.
+	 *  - Initialize all TDMRs.
+	 *
+	 *  Return error before all steps are done.
+	 */
+	return -EINVAL;
+}
+
+static int __tdx_enable(void)
+{
+	int ret;
+
+	ret = init_tdx_module();
+	if (ret) {
+		pr_err("initialization failed (%d)\n", ret);
+		tdx_module_status = TDX_MODULE_ERROR;
+		/*
+		 * Just return one universal error code.
+		 * For now the caller cannot recover anyway.
+		 */
+		return -EINVAL;
+	}
+
+	pr_info("TDX module initialized.\n");
+	tdx_module_status = TDX_MODULE_INITIALIZED;
+
+	return 0;
+}
+
+/**
+ * tdx_enable - Enable TDX to be ready to run TDX guests
+ *
+ * Initialize the TDX module to enable TDX.  After this function, the TDX
+ * module is ready to create and run TDX guests.
+ *
+ * This function assumes all online cpus are already in VMX operation.
+ * This function can be called in parallel by multiple callers.
+ *
+ * Return 0 if TDX is enabled successfully, otherwise error.
+ */
+int tdx_enable(void)
+{
+	int ret;
+
+	if (!platform_tdx_enabled()) {
+		pr_err_once("initialization failed: TDX is disabled.\n");
+		return -EINVAL;
+	}
+
+	mutex_lock(&tdx_module_lock);
+
+	switch (tdx_module_status) {
+	case TDX_MODULE_UNKNOWN:
+		ret = __tdx_enable();
+		break;
+	case TDX_MODULE_INITIALIZED:
+		/* Already initialized, great, tell the caller. */
+		ret = 0;
+		break;
+	default:
+		/* Failed to initialize in the previous attempts */
+		ret = -EINVAL;
+		break;
+	}
+
+	mutex_unlock(&tdx_module_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(tdx_enable);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
new file mode 100644
index 000000000000..881cca276956
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -0,0 +1,12 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_VIRT_TDX_H
+#define _X86_VIRT_TDX_H
+
+/* Kernel defined TDX module status during module initialization. */
+enum tdx_module_status_t {
+	TDX_MODULE_UNKNOWN,
+	TDX_MODULE_INITIALIZED,
+	TDX_MODULE_ERROR
+};
+
+#endif