[v2,00/58] TDX QEMU support

Message ID 20230818095041.1973309-1-xiaoyao.li@intel.com (mailing list archive)

Xiaoyao Li Aug. 18, 2023, 9:49 a.m. UTC
This is the v2 series adding TDX support to QEMU.

This patch series aims to enable TDX support to allow creating and booting a
TD (TDX VM) with QEMU. It must be used with the corresponding KVM v15 patch
series [1]. TDX-related documents can be found in [2].

This series is based on the QEMU gmem implementation, which is posted at [3].
This series is also available on GitHub:
https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v2

This version aims to update the TDX QEMU side to match the latest TDX
KVM side implementation, which exposes gmem for private memory. This
version is not intended to be the final version because how to support KVM
gmem in QEMU is not finalized yet. Review comments are welcome
nonetheless.


[1] KVM TDX basic feature support v15
https://lore.kernel.org/kvm/cover.1690322424.git.isaku.yamahata@intel.com/

[2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html

[3] https://lore.kernel.org/all/20230731162201.271114-1-xiaoyao.li@intel.com/


== Limitation and future work ==
- Readonly memslot

  TDX only supports readonly (write-protected) memslots for shared memory, but
  not for private memory. For simplicity, readonly memslots are marked as
  unsupported entirely for TDX.

- CPU model

  We cannot create a TD with an arbitrary CPU model as we can for non-TDX VMs,
  because only a subset of features can be configured for a TD.

  - It's recommended to use '-cpu host' to create TD;
  - '+feature/-feature' might not work as expected;

  Future work: introduce a specific CPU model for TDs and enhance +/-feature
               handling for TDs.

- gdb support

  gdb support to debug a TD of off-debug mode is future work.


== Change history ==
Changes from v1:
[v1] https://lore.kernel.org/qemu-devel/20220802074750.2581308-1-xiaoyao.li@intel.com/

- Switch to KVM gmem interface for private memory;
- Add TDVMCALL and its sub leaves support;
- mark LMCE as unsupported for TD VM;
- bring back the support of mrconfigid/mrowner/mrownerconfig;
- update documentation;

Changes from RFC v4:
[RFC v4] https://lore.kernel.org/qemu-devel/20220512031803.3315890-1-xiaoyao.li@intel.com/

- Add 3 more patches (9, 10, 11) to improve tdx_get_supported_cpuid();
- make attributes of object tdx-guest not settable by user;
- improve get_tdx_capabilities() by using a known starting value and
  limiting the loop with a known size;
- clarify why isa.bios needs to be skipped;
- remove the MMIO hob setup since OVMF sets them up itself;
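
The bounded retry mentioned above for get_tdx_capabilities() can be pictured
roughly as follows. This is a standalone sketch, not the code from the series:
fake_query_caps(), START_ENTRIES and MAX_ENTRIES are hypothetical stand-ins
for the KVM_TDX_CAPABILITIES call and whatever limits the patch actually uses,
and the "needs 20 entries" behaviour is invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical limits; the values used in the series may differ. */
#define START_ENTRIES 6
#define MAX_ENTRIES   256

struct caps {
    unsigned int nr_entries;   /* in: capacity, out: entries filled */
    unsigned int entries[];    /* flexible array member */
};

/*
 * Stand-in for the real query: pretends the kernel has 20 entries to
 * report and fails with -E2BIG when the caller's buffer is too small.
 */
static int fake_query_caps(struct caps *c)
{
    const unsigned int needed = 20;

    assert(c != NULL);
    if (c->nr_entries < needed) {
        return -E2BIG;
    }
    for (unsigned int i = 0; i < needed; i++) {
        c->entries[i] = i;
    }
    c->nr_entries = needed;
    return 0;
}

/*
 * Start from a known size and double it on -E2BIG, but never grow past
 * MAX_ENTRIES, so the loop always terminates. Returns NULL on failure;
 * the caller frees the result with free().
 */
struct caps *get_caps(void)
{
    for (unsigned int n = START_ENTRIES; n <= MAX_ENTRIES; n *= 2) {
        struct caps *c = calloc(1, sizeof(*c) + n * sizeof(c->entries[0]));
        int r;

        if (!c) {
            return NULL;
        }
        c->nr_entries = n;
        r = fake_query_caps(c);
        if (r == 0) {
            return c;
        }
        free(c);
        if (r != -E2BIG) {
            return NULL;   /* unrelated error: do not retry */
        }
    }
    return NULL;           /* exceeded the known upper bound */
}
```

With these made-up numbers the loop tries capacities 6, 12 and 24 and succeeds
on the third attempt. The point of the upper bound is that a misbehaving
callee cannot make the caller allocate without limit.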

Changes from RFC v3:
[RFC v3] https://lore.kernel.org/qemu-devel/20220317135913.2166202-1-xiaoyao.li@intel.com/

- Load TDVF with -bios interface;
- Adapt to KVM API changes;
	- KVM_TDX_CAPABILITIES changes back to KVM-scope;
	- struct kvm_tdx_init_vm changes;
- Define TDX_SUPPORTED_KVM_FEATURES;
- Drop the patch of introducing property sept-ve-disable since it's not
  public yet;
- some misc cleanups

Changes from RFC v2:
[RFC v2] https://lore.kernel.org/qemu-devel/cover.1625704980.git.isaku.yamahata@intel.com/

- Get vm-type from confidential-guest-support object type;
- Drop machine_init_done_late_notifiers;
- Refactor tdx_ioctl implementation;
- re-use existing pflash interface to load TDVF (i.e., OVMF binaries);
- introduce new data structure to track memory type instead of changing
  e820 table;
- Force smm to off for TDX VM;
- Drop the patches that suppress level-trigger/SMI/INIT/SIPI since KVM
  will ignore them;
- Add documentation;

Changes from RFC v1:
[RFC v1] https://lore.kernel.org/qemu-devel/cover.1613188118.git.isaku.yamahata@intel.com/

- suppress level trigger/SMI/INIT/SIPI related to IOAPIC.
- add VM attribute sha384 to TD measurement.
- guest TSC Hz specification



Chao Peng (1):
  i386/tdx: register TDVF as private memory

Chenyi Qiang (2):
  i386/tdx: register the fd read callback with the main loop to read the
    quote data
  i386/tdx: setup a timer for the qio channel

Isaku Yamahata (14):
  i386/tdx: Make sept_ve_disable set by default
  qom: implement property helper for sha384
  i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
  i386/tdx: Create kvm gmem for TD
  kvm/tdx: Don't complain when converting vMMIO region to shared
  kvm/tdx: Ignore memory conversion to shared of unassigned region
  i386/tdvf: Introduce function to parse TDVF metadata
  i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt>
  i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall
  i386/tdx: Limit the range size for MapGPA
  hw/i386: add option to forcibly report edge trigger in acpi tables
  i386/tdx: Don't synchronize guest tsc for TDs

Sean Christopherson (2):
  i386/kvm: Move architectural CPUID leaf generation to separate helper
  i386/tdx: Don't get/put guest state for TDX VMs

Xiaoyao Li (39):
  *** HACK *** linux-headers: Update headers to pull in TDX API changes
  i386: Introduce tdx-guest object
  target/i386: Parse TDX vm type
  target/i386: Introduce kvm_confidential_guest_init()
  i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  i386/tdx: Adjust the supported CPUID based on TDX restrictions
  i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by
    tdx_caps.cpuid_config[]
  i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
  i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
  kvm: Introduce kvm_arch_pre_create_vcpu()
  i386/tdx: Initialize TDX before creating TD vcpus
  i386/tdx: Add property sept-ve-disable for tdx-guest object
  i386/tdx: Wire CPU features up with attributes of TD guest
  i386/tdx: Validate TD attributes
  i386/tdx: Implement user specified tsc frequency
  i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  i386/tdx: Make memory type private by default
  i386/tdx: Parse TDVF metadata for TDX VM
  i386/tdx: Skip BIOS shadowing setup
  i386/tdx: Don't initialize pc.rom for TDX VMs
  i386/tdx: Track mem_ptr for each firmware entry of TDVF
  i386/tdx: Track RAM entries for TDX VM
  headers: Add definitions from UEFI spec for volumes, resources, etc...
  i386/tdx: Setup the TD HOB list
  memory: Introduce memory_region_init_ram_gmem()
  i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  i386/tdx: Finalize TDX VM
  i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR>
  i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  i386/tdx: Disable SMM for TDX VMs
  i386/tdx: Disable PIC for TDX VMs
  i386/tdx: Don't allow system reset for TDX VMs
  i386/tdx: LMCE is not supported for TDX
  hw/i386: add eoi_intercept_unsupported member to X86MachineState
  i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  i386/tdx: Skip kvm_put_apicbase() for TDs
  docs: Add TDX documentation

 accel/kvm/kvm-all.c                        |   55 +-
 configs/devices/i386-softmmu/default.mak   |    1 +
 docs/system/confidential-guest-support.rst |    1 +
 docs/system/i386/tdx.rst                   |  114 ++
 docs/system/target-i386.rst                |    1 +
 hw/i386/Kconfig                            |    6 +
 hw/i386/acpi-build.c                       |   99 +-
 hw/i386/acpi-common.c                      |   50 +-
 hw/i386/meson.build                        |    1 +
 hw/i386/pc.c                               |   21 +-
 hw/i386/pc_sysfw.c                         |    7 +
 hw/i386/tdvf-hob.c                         |  147 ++
 hw/i386/tdvf-hob.h                         |   24 +
 hw/i386/tdvf.c                             |  200 +++
 hw/i386/x86.c                              |   38 +-
 include/exec/memory.h                      |    6 +
 include/hw/i386/tdvf.h                     |   58 +
 include/hw/i386/x86.h                      |    1 +
 include/qom/object.h                       |   17 +
 include/standard-headers/uefi/uefi.h       |  198 +++
 include/sysemu/kvm.h                       |    3 +
 linux-headers/asm-x86/kvm.h                |   90 ++
 linux-headers/linux/kvm.h                  |   87 ++
 qapi/qom.json                              |   26 +
 qapi/run-state.json                        |   17 +-
 qom/object.c                               |   76 +
 softmmu/memory.c                           |   52 +
 softmmu/runstate.c                         |   49 +
 target/i386/cpu-internal.h                 |    9 +
 target/i386/cpu.c                          |   12 -
 target/i386/cpu.h                          |   21 +
 target/i386/kvm/kvm-cpu.c                  |    5 +
 target/i386/kvm/kvm.c                      |  586 ++++----
 target/i386/kvm/kvm_i386.h                 |    5 +
 target/i386/kvm/meson.build                |    2 +
 target/i386/kvm/tdx-stub.c                 |   22 +
 target/i386/kvm/tdx.c                      | 1543 ++++++++++++++++++++
 target/i386/kvm/tdx.h                      |   73 +
 target/i386/sev.c                          |    1 -
 target/i386/sev.h                          |    2 +
 40 files changed, 3382 insertions(+), 344 deletions(-)
 create mode 100644 docs/system/i386/tdx.rst
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 include/hw/i386/tdvf.h
 create mode 100644 include/standard-headers/uefi/uefi.h
 create mode 100644 target/i386/kvm/tdx-stub.c
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

Comments

Chenyi Qiang Aug. 24, 2023, 7:21 a.m. UTC | #1
On 8/18/2023 5:50 PM, Xiaoyao Li wrote:
> From: Chenyi Qiang <chenyi.qiang@intel.com>
> 
> To avoid no response from QGS server, setup a timer for the transaction. If
> timeout, make it an error and interrupt guest. Define the threshold of time
> to 30s at present, maybe change to other value if not appropriate.
> 
> Extract the common cleanup code to make it more clear.
> 
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/tdx.c | 151 ++++++++++++++++++++++++------------------
>  1 file changed, 85 insertions(+), 66 deletions(-)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 3cb2163a0335..fa658ce1f2e4 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -1002,6 +1002,7 @@ struct tdx_get_quote_task {
>      struct tdx_get_quote_header hdr;
>      int event_notify_interrupt;
>      QIOChannelSocket *ioc;
> +    QEMUTimer timer;
>  };
>  
>  struct x86_msi {
> @@ -1084,13 +1085,48 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
>      }
>  }
>  
> +static void tdx_getquote_task_cleanup(struct tdx_get_quote_task *t, bool outlen_overflow)
> +{
> +    MachineState *ms;
> +    TdxGuest *tdx;
> +
> +    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS) && !outlen_overflow) {
> +        t->hdr.out_len = cpu_to_le32(0);
> +    }
> +
> +    /* Publish the response contents before marking this request completed. */
> +    smp_wmb();
> +    if (address_space_write(
> +            &address_space_memory, t->gpa,
> +            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
> +        error_report("TDX: failed to update GetQuote header.");
> +    }
> +    tdx_td_notify(t);
> +
> +    if (t->ioc->fd > 0) {
> +        qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
> +    }
> +    qio_channel_close(QIO_CHANNEL(t->ioc), NULL);
> +    object_unref(OBJECT(t->ioc));
> +    timer_del(&t->timer);

Xiaoyao, I guess you missed a bug fix patch here as t->timer could be
uninitialized and then timer_del() will cause segv.
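
To illustrate the concern: if the cleanup helper runs on an error path before
the timer has been set up, deleting the timer touches whatever the allocator
left in the struct. The sketch below is standalone and uses a hypothetical
toy_timer type, not QEMU's QEMUTimer. It assumes one possible remedy, namely
allocating the task zeroed so that deletion is a no-op until the timer is
initialized. The actual fix patch is not shown in this thread and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for a timer and the list it is queued on. */
struct toy_timer_list {
    int active;                     /* number of armed timers */
};

struct toy_timer {
    struct toy_timer_list *list;    /* NULL until toy_timer_init() */
    bool armed;
};

struct task {
    struct toy_timer timer;
};

static void toy_timer_init(struct toy_timer *t, struct toy_timer_list *l)
{
    t->list = l;
    t->armed = false;
}

static void toy_timer_arm(struct toy_timer *t)
{
    if (!t->armed) {
        t->list->active++;
        t->armed = true;
    }
}

/*
 * Safe on a zeroed or initialized timer. On a timer holding garbage,
 * t->list would be a wild pointer and the dereference below could crash.
 */
static void toy_timer_del(struct toy_timer *t)
{
    if (t->list == NULL) {
        return;                     /* never initialized: nothing to do */
    }
    if (t->armed) {
        t->list->active--;
        t->armed = false;
    }
}

/* Zeroed allocation keeps the timer in a known state from the start. */
struct task *task_new(void)
{
    return calloc(1, sizeof(struct task));
}

/* Common cleanup, reachable before or after the timer was set up. */
void task_cleanup(struct task *t)
{
    toy_timer_del(&t->timer);
    free(t);
}
```

An alternative with the same effect would be to initialize the timer
immediately after allocating the task, before any path that can reach the
cleanup helper.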

> +    g_free(t->out_data);
> +    g_free(t);
> +
> +    /* Maintain the number of in-flight requests. */
> +    ms = MACHINE(qdev_get_machine());
> +    tdx = TDX_GUEST(ms->cgs);
> +    qemu_mutex_lock(&tdx->lock);
> +    tdx->quote_generation_num--;
> +    qemu_mutex_unlock(&tdx->lock);
> +}
> +
Xiaoyao Li Aug. 24, 2023, 8:34 a.m. UTC | #2
On 8/24/2023 3:21 PM, Chenyi Qiang wrote:
> 
> 
> On 8/18/2023 5:50 PM, Xiaoyao Li wrote:
>> From: Chenyi Qiang <chenyi.qiang@intel.com>
>>
>> To avoid no response from QGS server, setup a timer for the transaction. If
>> timeout, make it an error and interrupt guest. Define the threshold of time
>> to 30s at present, maybe change to other value if not appropriate.
>>
>> Extract the common cleanup code to make it more clear.
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   target/i386/kvm/tdx.c | 151 ++++++++++++++++++++++++------------------
>>   1 file changed, 85 insertions(+), 66 deletions(-)
>>
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 3cb2163a0335..fa658ce1f2e4 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -1002,6 +1002,7 @@ struct tdx_get_quote_task {
>>       struct tdx_get_quote_header hdr;
>>       int event_notify_interrupt;
>>       QIOChannelSocket *ioc;
>> +    QEMUTimer timer;
>>   };
>>   
>>   struct x86_msi {
>> @@ -1084,13 +1085,48 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
>>       }
>>   }
>>   
>> +static void tdx_getquote_task_cleanup(struct tdx_get_quote_task *t, bool outlen_overflow)
>> +{
>> +    MachineState *ms;
>> +    TdxGuest *tdx;
>> +
>> +    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS) && !outlen_overflow) {
>> +        t->hdr.out_len = cpu_to_le32(0);
>> +    }
>> +
>> +    /* Publish the response contents before marking this request completed. */
>> +    smp_wmb();
>> +    if (address_space_write(
>> +            &address_space_memory, t->gpa,
>> +            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
>> +        error_report("TDX: failed to update GetQuote header.");
>> +    }
>> +    tdx_td_notify(t);
>> +
>> +    if (t->ioc->fd > 0) {
>> +        qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
>> +    }
>> +    qio_channel_close(QIO_CHANNEL(t->ioc), NULL);
>> +    object_unref(OBJECT(t->ioc));
>> +    timer_del(&t->timer);
> 
> Xiaoyao, I guess you missed a bug fix patch here as t->timer could be
> uninitialized and then timer_del() will cause segv.

Thanks for the reminder.
I'll update this patch to include the fix.

Thanks,
-Xiaoyao

>> +    g_free(t->out_data);
>> +    g_free(t);
>> +
>> +    /* Maintain the number of in-flight requests. */
>> +    ms = MACHINE(qdev_get_machine());
>> +    tdx = TDX_GUEST(ms->cgs);
>> +    qemu_mutex_lock(&tdx->lock);
>> +    tdx->quote_generation_num--;
>> +    qemu_mutex_unlock(&tdx->lock);
>> +}
>> +
>