From patchwork Wed Jun 22 09:37:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12890405
From: Zhiquan Li
To: linux-sgx@vger.kernel.org, tony.luck@intel.com, jarkko@kernel.org,
 dave.hansen@linux.intel.com
Cc: seanjc@google.com, kai.huang@intel.com, fan.du@intel.com,
 cathy.zhang@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual
 address of virtual EPC page
Date: Wed, 22 Jun 2022 17:37:03 +0800
Message-Id: <20220622093705.2891642-2-zhiquan1.li@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220622093705.2891642-1-zhiquan1.li@intel.com>
References: <20220622093705.2891642-1-zhiquan1.li@intel.com>
Precedence: bulk
X-Mailing-List: linux-sgx@vger.kernel.org

When an EPC page triggers a machine check, only the physical address of
the page is reported. However, the virtual address is required in order
to inject #MC into the hypervisor. Repurpose the "owner" field to hold
the virtual address of the virtual EPC page so that
arch_memory_failure() can easily retrieve it.

Add a new EPC page flag, SGX_EPC_PAGE_KVM_GUEST, to indicate how the
field is to be interpreted.

Co-developed-by: Cathy Zhang
Signed-off-by: Cathy Zhang
Signed-off-by: Zhiquan Li
Acked-by: Kai Huang
---
Changes since V4:
- Add Co-developed-by and Signed-off-by from Cathy Zhang, as she had
  fully discussed the flag name with Jarkko.
  Link: https://lore.kernel.org/all/df92395ade424401ac3c6322de568720@intel.com/
- Add Acked-by from Kai Huang.
  Link: https://lore.kernel.org/linux-sgx/0676cd4e-d94b-e904-81ae-ca1c05d37070@intel.com/T/#mccfb11df30698dbd060f2b6f06383cda7f154ef3

Changes since V3:
- Take the definition of the EPC page flag SGX_EPC_PAGE_KVM_GUEST from
  the third patch of Cathy Zhang's SGX rebootless recovery patch set,
  but discard the irrelevant portion, since that series might need some
  time to be reworked and the two are different features.
  Link: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@kernel.org/T/#m9782d23496cacecb7da07a67daa79f4b322ae170

Changes since V2:
- Rework the patch as suggested by Jarkko.
- Remove struct sgx_vepc_page and relevant code.
- Remove the definition of the new EPC page flag SGX_EPC_PAGE_IS_VEPC,
  as it duplicates SGX_EPC_PAGE_KVM_GUEST.
  Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@intel.com/T/#u

Changes since V1:
- Add documentation suggested by Jarkko.
---
 arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
 arch/x86/kernel/cpu/sgx/virt.c | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 0f17def9fe6f..b43582da1bcf 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -28,6 +28,8 @@
 
 /* Pages on free list */
 #define SGX_EPC_PAGE_IS_FREE		BIT(1)
+/* Pages allocated for KVM guest */
+#define SGX_EPC_PAGE_KVM_GUEST		BIT(2)
 
 struct sgx_epc_page {
 	unsigned int section;
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 6a77a14eee38..776ae5c1c032 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
 	if (epc_page)
 		return 0;
 
-	epc_page = sgx_alloc_epc_page(vepc, false);
+	epc_page = sgx_alloc_epc_page((void *)addr, false);
 	if (IS_ERR(epc_page))
 		return PTR_ERR(epc_page);
 
+	epc_page->flags |= SGX_EPC_PAGE_KVM_GUEST;
+
 	ret = xa_err(xa_store(&vepc->page_array, index, epc_page, GFP_KERNEL));
 	if (ret)
 		goto err_free;

From patchwork Wed Jun 22 09:37:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12890406
From: Zhiquan Li
To: linux-sgx@vger.kernel.org, tony.luck@intel.com, jarkko@kernel.org,
 dave.hansen@linux.intel.com
Cc: seanjc@google.com, kai.huang@intel.com, fan.du@intel.com,
 cathy.zhang@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for
 virtualization
Date: Wed, 22 Jun 2022 17:37:04 +0800
Message-Id: <20220622093705.2891642-3-zhiquan1.li@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220622093705.2891642-1-zhiquan1.li@intel.com>
References: <20220622093705.2891642-1-zhiquan1.li@intel.com>
Precedence: bulk
X-Mailing-List: linux-sgx@vger.kernel.org

When a VM guest accesses an SGX EPC page with a memory failure, the
current behavior kills the whole guest, whereas the expected behavior
is to kill only the SGX application inside it.

To fix this, send SIGBUS with code BUS_MCEERR_AR and extra information
(the virtual address of the failed page) so that the hypervisor can
inject #MC into the guest, which is helpful in the SGX case. The rest
is handled on the guest side. Hypervisors such as QEMU already have a
mature facility to convert an HVA to a GPA and inject #MC into the
guest OS.

Unlike host enclaves, a virtual EPC instance cannot be shared by
multiple VMs, because how enclaves are created is entirely up to the
guest. Sharing a virtual EPC instance would very likely break enclaves
in all VMs unexpectedly.

The SGX virtual EPC driver doesn't explicitly prevent a virtual EPC
instance from being shared by multiple VMs via fork(). However, KVM
doesn't support running a VM across multiple mm structures, and the de
facto userspace hypervisor (QEMU) doesn't use fork() to create a new
VM, so in practice this should not happen.

Signed-off-by: Zhiquan Li
Acked-by: Kai Huang
Link: https://lore.kernel.org/linux-sgx/443cb425-009c-2784-56f4-5e707122de76@intel.com/T/#m1d1f4098f4fad78034e8706a60e4d79c119db407
---
Changes since V4:
- Switch the order of the two variables so that all variables are
  declared in reverse Christmas tree order.
- Do not initialize "ret" because it is unconditionally overwritten by
  the return value of force_sig_mceerr().

Changes since V2:
- Retrieve the virtual address from the "owner" field of struct
  sgx_epc_page, instead of struct sgx_vepc_page.
- Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicates.

Changes since V1:
- Add Acked-by from Kai Huang.
- Add Kai’s excellent explanation of why we do not need to consider
  one virtual EPC being shared by two guests.
---
 arch/x86/kernel/cpu/sgx/main.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index ab4ec54bbdd9..4507c2302348 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -715,6 +715,8 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT);
 	struct sgx_epc_section *section;
 	struct sgx_numa_node *node;
+	unsigned long vaddr;
+	int ret;
 
 	/*
 	 * mm/memory-failure.c calls this routine for all errors
@@ -731,8 +733,26 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	 * error. The signal may help the task understand why the
 	 * enclave is broken.
 	 */
-	if (flags & MF_ACTION_REQUIRED)
-		force_sig(SIGBUS);
+	if (flags & MF_ACTION_REQUIRED) {
+		/*
+		 * Provide extra info to the task so that it can make further
+		 * decision but not simply kill it. This is quite useful for
+		 * virtualization case.
+		 */
+		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+			/*
+			 * The "owner" field is repurposed as the virtual address
+			 * of virtual EPC page.
+			 */
+			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
+					PAGE_SHIFT);
+			if (ret < 0)
+				pr_err("Memory failure: Error sending signal to %s:%d: %d\n",
+					current->comm, current->pid, ret);
+		} else
+			force_sig(SIGBUS);
+	}
 
 	section = &sgx_epc_sections[page->section];
 	node = section->node;

From patchwork Wed Jun 22 09:37:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12890407
From: Zhiquan Li
To: linux-sgx@vger.kernel.org, tony.luck@intel.com, jarkko@kernel.org,
 dave.hansen@linux.intel.com
Cc: seanjc@google.com, kai.huang@intel.com, fan.du@intel.com,
 cathy.zhang@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal
 case
Date: Wed, 22 Jun 2022 17:37:05 +0800
Message-Id: <20220622093705.2891642-4-zhiquan1.li@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220622093705.2891642-1-zhiquan1.li@intel.com>
References: <20220622093705.2891642-1-zhiquan1.li@intel.com>
Precedence: bulk
X-Mailing-List: linux-sgx@vger.kernel.org

When an application accesses an SGX EPC page with a memory failure, the
task receives a SIGBUS signal without any extra info, unless the EPC
page has the SGX_EPC_PAGE_KVM_GUEST flag set. However, in some cases
SGX is used only in a sub-task, and the entire task group should not be
killed just because an SGX EPC page of that sub-task has a memory
failure.

To fix this, extend the solution to the normal case. That is, a regular
SGX EPC page with a memory failure triggers a SIGBUS signal with code
BUS_MCEERR_AR and additional info, so that the user has the opportunity
to make a further decision.

Suppose an enclave is shared by multiple processes. When an enclave
page triggers a machine check, the enclave is disabled so that it
cannot be entered again.
Killing other processes with the same enclave mapped would perhaps be
overkill, but they are going to find that the enclave is "dead" the
next time they try to use it. Thanks to Jarkko for the heads-up and to
Tony for the clarification on this point.

Our intention is to provide additional info so that the application has
more choices. The current behavior seems gentle enough, and we don't
want to change it.

Signed-off-by: Zhiquan Li
---
No changes since V4.

Changes since V2:
- Adapt the code since struct sgx_vepc_page was discarded.
- Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicates.

Changes since V1:
- Add valuable information from Jarkko and Tony into the commit
  message.
---
 arch/x86/kernel/cpu/sgx/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 4507c2302348..7c55dcdb2b7c 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -739,12 +739,15 @@ int arch_memory_failure(unsigned long pfn, int flags)
 		 * decision but not simply kill it. This is quite useful for
 		 * virtualization case.
 		 */
-		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+		if (page->owner) {
 			/*
 			 * The "owner" field is repurposed as the virtual address
 			 * of virtual EPC page.
 			 */
-			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			if (page->flags & SGX_EPC_PAGE_KVM_GUEST)
+				vaddr = (unsigned long)page->owner & PAGE_MASK;
+			else
+				vaddr = (unsigned long)page->owner->desc & PAGE_MASK;
 			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
 					PAGE_SHIFT);
 			if (ret < 0)
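
For illustration only, here is a rough userspace sketch of how a task
(a hypervisor or an SGX application) might tell the new signal apart
from the old plain SIGBUS. It is not part of this series. The names
decode_sigbus() and struct mce_report are hypothetical; the sketch only
assumes the standard Linux siginfo_t fields si_code, si_addr and
si_addr_lsb, and that a plain force_sig(SIGBUS) does not arrive with a
BUS_MCEERR_* code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type for this sketch; not defined by the patches. */
struct mce_report {
	bool action_required;	/* true for BUS_MCEERR_AR */
	uintptr_t page_addr;	/* si_addr rounded down to the reported granule */
	size_t span;		/* bytes covered: 1 << si_addr_lsb */
};

/*
 * Decode the siginfo of a SIGBUS. Returns true and fills *out only when
 * the signal carries a machine-check code. Any other SIGBUS has no
 * machine-check address attached, so the function returns false.
 */
bool decode_sigbus(const siginfo_t *info, struct mce_report *out)
{
	assert(info != NULL && out != NULL);

	if (info->si_signo != SIGBUS)
		return false;
	if (info->si_code != BUS_MCEERR_AR && info->si_code != BUS_MCEERR_AO)
		return false;

	out->action_required = (info->si_code == BUS_MCEERR_AR);
	/* With this series the kernel passes PAGE_SHIFT as the lsb. */
	out->span = (size_t)1 << info->si_addr_lsb;
	out->page_addr = (uintptr_t)info->si_addr & ~((uintptr_t)out->span - 1);
	return true;
}
```

A handler installed with sigaction() and SA_SIGINFO could call
decode_sigbus() on the siginfo_t it receives. What to do with the
result (translate the address and inject #MC into a guest, or tear down
only the affected sub-task) is up to the application and is not shown.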