From patchwork Tue Mar 9 01:38:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Kai" X-Patchwork-Id: 12123717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC693C433E6 for ; Tue, 9 Mar 2021 01:39:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C6C3565111 for ; Tue, 9 Mar 2021 01:39:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230234AbhCIBjI (ORCPT ); Mon, 8 Mar 2021 20:39:08 -0500 Received: from mga05.intel.com ([192.55.52.43]:12739 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230252AbhCIBjE (ORCPT ); Mon, 8 Mar 2021 20:39:04 -0500 IronPort-SDR: qZyKRUAaC+RPF/DrTNZMWXyaUYZVHj/UYz7OHZQ3hdij/iSRJIVV7M6WLpBP0g7M/twCrxp9w/ l8w1U9t96bAQ== X-IronPort-AV: E=McAfee;i="6000,8403,9917"; a="273166122" X-IronPort-AV: E=Sophos;i="5.81,233,1610438400"; d="scan'208";a="273166122" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2021 17:39:02 -0800 IronPort-SDR: LQxC6Z98sV7ixDxOqMr/GPTQiFisZHrtBQSudWfurmzYxTCSkXlVEFsaFdhqsqmTwWQ/+9BIry njcQ0nY5OdZw== X-IronPort-AV: E=Sophos;i="5.81,233,1610438400"; d="scan'208";a="509117184" Received: from kzliu-mobl.amr.corp.intel.com (HELO khuang2-desk.gar.corp.intel.com) ([10.252.128.38]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2021 17:38:57 -0800 From: Kai Huang To: kvm@vger.kernel.org, x86@kernel.org, linux-sgx@vger.kernel.org Cc: linux-kernel@vger.kernel.org, seanjc@google.com, jarkko@kernel.org, luto@kernel.org, dave.hansen@intel.com, rick.p.edgecombe@intel.com, haitao.huang@intel.com, pbonzini@redhat.com, bp@alien8.de, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, jethro@fortanix.com, b.thiel@posteo.de, jmattson@google.com, joro@8bytes.org, vkuznets@redhat.com, wanpengli@tencent.com, corbet@lwn.net Subject: [PATCH v2 00/25] KVM SGX virtualization support Date: Tue, 9 Mar 2021 14:38:49 +1300 Message-Id: X-Mailer: git-send-email 2.29.2 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org This series adds KVM SGX virtualization support. The first 14 patches starting with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX core/driver to support KVM SGX virtualization, while the rest are patches to KVM subsystem. This series is based against latest upstream kernel master branch. You can also get the code from upstream branch of kvm-sgx repo on github: https://github.com/intel/kvm-sgx.git upstream It also requires Qemu changes to create VM with SGX support. You can find Qemu repo here: https://github.com/intel/qemu-sgx.git upstream Please refer to README.md of above qemu-sgx repo for detail on how to create guest with SGX support. At meantime, for your quick reference you can use below command to create SGX guest: #qemu-system-x86_64 -smp 4 -m 2G -drive file=,if=virtio \ -cpu host,+sgx_provisionkey \ -sgx-epc id=epc1,memdev=mem1 \ -object memory-backend-epc,id=mem1,size=64M,prealloc Please note that the SGX relevant part is: -cpu host,+sgx_provisionkey \ -sgx-epc id=epc1,memdev=mem1 \ -object memory-backend-epc,id=mem1,size=64M,prealloc And you can change other parameters of your qemu command based on your needs. ========= Changelog: (Changelog here is for global changes. Please see each patch's changelog for changes made to specific patch.) v1->v2: - No big change in design, structural of patch series, etc. - Addressed Boris's comments regarding to suppressing both SGX1 and SGX2 in /proc/cpuinfo, and improvement in feat_ctl.c when enabling SGX (patch 2 and 6). - Addressed Sean's comments for both x86 part patches and KVM patches (patch 3, 5, 9, 12, 19, 21). - Addressed Dave's comments in RFC v6 series (patch 13). RFC->v1: - Refined patch (x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()) to print error msg that EPC page is leaked when EREMOVE failed, requested by Dave. - Changelog history of all RFC series is removed in both this cover letter and each individual patch, since majority of x86 part patches already got Acked-by from Dave and Jarkko. And the changelogs are not quite useful from my perspective. ========= KVM SGX virtualization Overview - Virtual EPC SGX enclave memory is special and is reserved specifically for enclave use. In bare-metal SGX enclaves, the kernel allocates enclave pages, copies data into the pages with privileged instructions, then allows the enclave to start. In this scenario, only initialized pages already assigned to an enclave are mapped to userspace. In virtualized environments, the hypervisor still needs to do the physical enclave page allocation. The guest kernel is responsible for the data copying (among other things). This means that the job of starting an enclave is now split between hypervisor and guest. This series introduces a new misc device: /dev/sgx_vepc. This device allows the host to map *uninitialized* enclave memory into userspace, which can then be passed into a guest. While it might be *possible* to start a host-side enclave with /dev/sgx_enclave and pass its memory into a guest, it would be wasteful and convoluted. Implement the *raw* EPC allocation in the x86 core-SGX subsystem via /dev/sgx_vepc rather than in KVM. Doing so has two major advantages: - Does not require changes to KVM's uAPI, e.g. EPC gets handled as just another memory backend for guests. - EPC management is wholly contained in the SGX subsystem, e.g. SGX does not have to export any symbols, changes to reclaim flows don't need to be routed through KVM, SGX's dirty laundry doesn't have to get aired out for the world to see, and so on and so forth. The virtual EPC pages allocated to guests are currently not reclaimable. Reclaiming EPC page used by enclave requires a special reclaim mechanism separate from normal page reclaim, and that mechanism is not supported for virutal EPC pages. Due to the complications of handling reclaim conflicts between guest and host, reclaiming virtual EPC pages is significantly more complex than basic support for SGX virtualization. - Support SGX virtualization without SGX Flexible Launch Control SGX hardware supports two "launch control" modes to limit which enclaves can run. In the "locked" mode, the hardware prevents enclaves from running unless they are blessed by a third party. In the unlocked mode, the kernel is in full control of which enclaves can run. The bare-metal SGX code refuses to launch enclaves unless it is in the unlocked mode. This sgx_virt_epc driver does not have such a restriction. This allows guests which are OK with the locked mode to use SGX, even if the host kernel refuses to. - Support exposing SGX2 Due to the same reason above, SGX2 feature detection is added to core SGX code to allow KVM to expose SGX2 to guest, even currently SGX driver doesn't support SGX2, because SGX2 can work just fine in guest w/o any interaction to host SGX driver. - Restricit SGX guest access to provisioning key To grant guest being able to fully use SGX, guest needs to be able to access provisioning key. The provisioning key is sensitive, and accessing to it should be restricted. In bare-metal driver, allowing enclave to access provisioning key is restricted by being able to open /dev/sgx_provision. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM uAPI to extend above mechanism to KVM guests as well. When userspace hypervisor creates a new VM, the new cap is only added to VM when userspace hypervisior is able to open /dev/sgx_provision, following the same role as in bare-metal driver. KVM then traps ECREATE from guest, and only allows ECREATE with provisioning key bit to run when guest supports KVM_CAP_SGX_ATTRIBUTE. Jarkko Sakkinen (1): x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() Kai Huang (3): x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Sean Christopherson (21): x86/cpufeatures: Add SGX1 and SGX2 sub-features x86/sgx: Add SGX_CHILD_PRESENT hardware error code x86/sgx: Introduce virtual EPC for use by KVM guests x86/cpu/intel: Allow SGX virtualization without Launch Control support x86/sgx: Expose SGX architectural definitions to the kernel x86/sgx: Move ENCLS leaf definitions to sgx.h x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) x86/sgx: Add encls_faulted() helper x86/sgx: Add helpers to expose ECREATE and EINIT to KVM x86/sgx: Move provisioning device creation out of SGX driver KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) KVM: x86: Define new #PF SGX error code bit KVM: x86: Add support for reverse CPUID lookup of scattered features KVM: x86: Add reverse-CPUID lookup support for scattered SGX features KVM: VMX: Add basic handling of VM-Exit from SGX enclave KVM: VMX: Frame in ENCLS handler for SGX virtualization KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC KVM: x86: Add capability to grant VM access to privileged SGX attribute Documentation/virt/kvm/api.rst | 23 + arch/x86/Kconfig | 12 + arch/x86/include/asm/cpufeatures.h | 2 + arch/x86/include/asm/kvm_host.h | 5 + .../cpu/sgx/arch.h => include/asm/sgx.h} | 50 +- arch/x86/include/asm/vmx.h | 1 + arch/x86/include/uapi/asm/vmx.h | 1 + arch/x86/kernel/cpu/cpuid-deps.c | 3 + arch/x86/kernel/cpu/feat_ctl.c | 71 ++- arch/x86/kernel/cpu/scattered.c | 2 + arch/x86/kernel/cpu/sgx/Makefile | 1 + arch/x86/kernel/cpu/sgx/driver.c | 17 - arch/x86/kernel/cpu/sgx/encl.c | 29 +- arch/x86/kernel/cpu/sgx/encls.h | 30 +- arch/x86/kernel/cpu/sgx/ioctl.c | 23 +- arch/x86/kernel/cpu/sgx/main.c | 94 +++- arch/x86/kernel/cpu/sgx/sgx.h | 13 +- arch/x86/kernel/cpu/sgx/virt.c | 370 ++++++++++++++ arch/x86/kvm/Makefile | 2 + arch/x86/kvm/cpuid.c | 89 +++- arch/x86/kvm/cpuid.h | 50 +- arch/x86/kvm/vmx/nested.c | 28 +- arch/x86/kvm/vmx/nested.h | 5 + arch/x86/kvm/vmx/sgx.c | 481 ++++++++++++++++++ arch/x86/kvm/vmx/sgx.h | 34 ++ arch/x86/kvm/vmx/vmcs12.c | 1 + arch/x86/kvm/vmx/vmcs12.h | 4 +- arch/x86/kvm/vmx/vmx.c | 109 +++- arch/x86/kvm/vmx/vmx.h | 2 + arch/x86/kvm/x86.c | 23 + include/uapi/linux/kvm.h | 1 + tools/testing/selftests/sgx/defines.h | 2 +- 32 files changed, 1460 insertions(+), 118 deletions(-) rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx.h} (89%) create mode 100644 arch/x86/kernel/cpu/sgx/virt.c create mode 100644 arch/x86/kvm/vmx/sgx.c create mode 100644 arch/x86/kvm/vmx/sgx.h