From patchwork Mon Jun 26 14:12:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Kai"
X-Patchwork-Id: 13292986
From: Kai Huang
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, dave.hansen@intel.com,
 kirill.shutemov@linux.intel.com, tony.luck@intel.com, peterz@infradead.org,
 tglx@linutronix.de, bp@alien8.de, mingo@redhat.com, hpa@zytor.com,
 seanjc@google.com, pbonzini@redhat.com, david@redhat.com,
 dan.j.williams@intel.com, rafael.j.wysocki@intel.com, ashok.raj@intel.com,
 reinette.chatre@intel.com, len.brown@intel.com, ak@linux.intel.com,
 isaku.yamahata@intel.com, ying.huang@intel.com, chao.gao@intel.com,
 sathyanarayanan.kuppuswamy@linux.intel.com, nik.borisov@suse.com,
 bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com,
 kai.huang@intel.com
Subject: [PATCH v12 18/22] x86/virt/tdx: Keep TDMRs when module initialization is successful
Date: Tue, 27 Jun 2023 02:12:48 +1200
Message-Id: <7d06fe5fda0e330895c1c9043b881f3c2a2d4f3f.1687784645.git.kai.huang@intel.com>
X-Mailer: git-send-email 2.40.1

On platforms with the "partial write machine check" erratum, kexec()
needs to convert all TDX private pages back to normal before booting
into the new kernel. Otherwise, the new kernel may get an unexpected
machine check.

There's no existing infrastructure to track TDX private pages. Keep the
TDMRs when module initialization is successful so that they can be used
to find PAMTs.

With this change, only put_online_mems() and freeing the buffer of the
TDSYSINFO_STRUCT and CMR array still need to be done even when module
initialization is successful.
Adjust the error handling to explicitly do them when module
initialization is successful and unconditionally clean up the rest when
initialization fails.

Signed-off-by: Kai Huang
---

v11 -> v12 (new patch):
  - Defer keeping TDMRs logic to this patch for better review
  - Improved error handling logic (Nikolay/Kirill in patch 15)

---
 arch/x86/virt/vmx/tdx/tdx.c | 84 ++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 52b7267ea226..85b24b2e9417 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -49,6 +49,8 @@ static DEFINE_MUTEX(tdx_module_lock);
 /* All TDX-usable memory regions. Protected by mem_hotplug_lock. */
 static LIST_HEAD(tdx_memlist);
 
+static struct tdmr_info_list tdx_tdmr_list;
+
 /*
  * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
  * to kernel error code. @seamcall_ret and @out contain the SEAMCALL
@@ -1047,7 +1049,6 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list)
 static int init_tdx_module(void)
 {
 	struct tdsysinfo_struct *sysinfo;
-	struct tdmr_info_list tdmr_list;
 	struct cmr_info *cmr_array;
 	int ret;
 
@@ -1088,17 +1089,17 @@ static int init_tdx_module(void)
 		goto out_put_tdxmem;
 
 	/* Allocate enough space for constructing TDMRs */
-	ret = alloc_tdmr_list(&tdmr_list, sysinfo);
+	ret = alloc_tdmr_list(&tdx_tdmr_list, sysinfo);
 	if (ret)
 		goto out_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdmr_list, sysinfo);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, sysinfo);
 	if (ret)
 		goto out_free_tdmrs;
 
 	/* Pass the TDMRs and the global KeyID to the TDX module */
-	ret = config_tdx_module(&tdmr_list, tdx_global_keyid);
+	ret = config_tdx_module(&tdx_tdmr_list, tdx_global_keyid);
 	if (ret)
 		goto out_free_pamts;
 
@@ -1118,51 +1119,50 @@ static int init_tdx_module(void)
 		goto out_reset_pamts;
 
 	/* Initialize TDMRs to complete the TDX module initialization */
-	ret = init_tdmrs(&tdmr_list);
+	ret = init_tdmrs(&tdx_tdmr_list);
+	if (ret)
+		goto out_reset_pamts;
+
+	pr_info("%lu KBs allocated for PAMT.\n",
+			tdmrs_count_pamt_kb(&tdx_tdmr_list));
+
+	/*
+	 * @tdx_memlist is written here and read at memory hotplug time.
+	 * Lock out memory hotplug code while building it.
+	 */
+	put_online_mems();
+	/*
+	 * For now both @sysinfo and @cmr_array are only used during
+	 * module initialization, so always free them.
+	 */
+	free_page((unsigned long)sysinfo);
+
+	return 0;
 out_reset_pamts:
-	if (ret) {
-		/*
-		 * Part of PAMTs may already have been initialized by the
-		 * TDX module. Flush cache before returning PAMTs back
-		 * to the kernel.
-		 */
-		wbinvd_on_all_cpus();
-		/*
-		 * According to the TDX hardware spec, if the platform
-		 * doesn't have the "partial write machine check"
-		 * erratum, any kernel read/write will never cause #MC
-		 * in kernel space, thus it's OK to not convert PAMTs
-		 * back to normal. But do the conversion anyway here
-		 * as suggested by the TDX spec.
-		 */
-		tdmrs_reset_pamt_all(&tdmr_list);
-	}
+	/*
+	 * Part of PAMTs may already have been initialized by the
+	 * TDX module. Flush cache before returning PAMTs back
+	 * to the kernel.
+	 */
+	wbinvd_on_all_cpus();
+	/*
+	 * According to the TDX hardware spec, if the platform
+	 * doesn't have the "partial write machine check"
+	 * erratum, any kernel read/write will never cause #MC
+	 * in kernel space, thus it's OK to not convert PAMTs
+	 * back to normal. But do the conversion anyway here
+	 * as suggested by the TDX spec.
+	 */
+	tdmrs_reset_pamt_all(&tdx_tdmr_list);
 out_free_pamts:
-	if (ret)
-		tdmrs_free_pamt_all(&tdmr_list);
-	else
-		pr_info("%lu KBs allocated for PAMT.\n",
-				tdmrs_count_pamt_kb(&tdmr_list));
+	tdmrs_free_pamt_all(&tdx_tdmr_list);
 out_free_tdmrs:
-	/*
-	 * Always free the buffer of TDMRs as they are only used during
-	 * module initialization.
-	 */
-	free_tdmr_list(&tdmr_list);
+	free_tdmr_list(&tdx_tdmr_list);
 out_free_tdxmem:
-	if (ret)
-		free_tdx_memlist(&tdx_memlist);
+	free_tdx_memlist(&tdx_memlist);
 out_put_tdxmem:
-	/*
-	 * @tdx_memlist is written here and read at memory hotplug time.
-	 * Lock out memory hotplug code while building it.
-	 */
 	put_online_mems();
 out:
-	/*
-	 * For now both @sysinfo and @cmr_array are only used during
-	 * module initialization, so always free them.
-	 */
 	free_page((unsigned long)sysinfo);
 	return ret;
 }
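
Note on the control-flow shape: the restructuring above replaces a shared exit path guarded by "if (ret)" checks with an explicit success return followed by unconditional unwind labels. The user-space sketch below only illustrates that shape and is not kernel code. Every name in it (struct region_list, kept_list, init_module_sketch, fail_at_configure) is hypothetical and stands in for the TDMR list and the init-only buffers; nothing here is taken from tdx.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a list that must outlive a successful init. */
struct region_list {
	int *entries;
	size_t count;
};

/* File-scope so later code could still reach it after init succeeds. */
static struct region_list kept_list;

/*
 * Illustrative init routine. @fail_at_configure forces the failure path
 * so that both outcomes can be exercised.
 */
static int init_module_sketch(bool fail_at_configure)
{
	char *scratch;	/* init-only buffer, freed on both paths */
	int ret;

	/* The sketch assumes init starts from a clean state. */
	assert(!kept_list.entries);

	scratch = malloc(64);
	if (!scratch)
		return -1;

	kept_list.entries = calloc(4, sizeof(*kept_list.entries));
	if (!kept_list.entries) {
		ret = -1;
		goto out;
	}
	kept_list.count = 4;

	/* Stand-in for a configuration step that may fail. */
	ret = fail_at_configure ? -2 : 0;
	if (ret)
		goto out_free_list;

	/* Success: release only the init-only buffer, keep the list. */
	free(scratch);
	return 0;

out_free_list:
	/* Failure: unwind unconditionally, no "if (ret)" checks needed. */
	free(kept_list.entries);
	kept_list.entries = NULL;
	kept_list.count = 0;
out:
	free(scratch);
	return ret;
}
```

With this shape the labels are reached only on failure, so each one can free its resource without testing ret, and the success path spells out the few cleanups that still apply to it.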