From patchwork Fri Dec 9 06:52:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Kai"
X-Patchwork-Id: 13069303
From: Kai Huang
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: linux-mm@kvack.org, dave.hansen@intel.com, peterz@infradead.org,
 tglx@linutronix.de, seanjc@google.com, pbonzini@redhat.com,
 dan.j.williams@intel.com, rafael.j.wysocki@intel.com,
 kirill.shutemov@linux.intel.com, ying.huang@intel.com,
 reinette.chatre@intel.com, len.brown@intel.com, tony.luck@intel.com,
 ak@linux.intel.com, isaku.yamahata@intel.com, chao.gao@intel.com,
 sathyanarayanan.kuppuswamy@linux.intel.com, bagasdotme@gmail.com,
 sagis@google.com, imammedo@redhat.com, kai.huang@intel.com
Subject: [PATCH v8 09/16] x86/virt/tdx: Fill out TDMRs to cover all TDX
 memory regions
Date: Fri, 9 Dec 2022 19:52:30 +1300
Message-Id: <6f9c0bc1074501fa2431bde73bdea57279bf0085.1670566861.git.kai.huang@intel.com>
X-Mailer: git-send-email 2.38.1

Start to transition out the multi-step process of constructing a list
of "TD Memory Regions" (TDMRs) to cover all TDX-usable memory regions.

The kernel configures TDX-usable memory regions by passing a list of
TDMRs to the TDX module.  Each TDMR contains the base/size of a memory
region, the base/size of the associated Physical Address Metadata Table
(PAMT) and a list of reserved areas in the region.

Do the first step: fill out a number of TDMRs to cover all TDX memory
regions.  To keep it simple, always try to use one TDMR for each memory
region.  In this first step, only set up the base/size for each TDMR.

Each TDMR must be 1G aligned and the size must be in 1G granularity.
This implies that one TDMR could cover multiple memory regions.

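As an illustration of the alignment rule above, here is a minimal
userspace sketch of the rounding.  The sketch_* names are hypothetical
and are not part of this patch; the patch itself uses the kernel's
ALIGN_DOWN()/ALIGN() helpers via TDMR_ALIGN_DOWN()/TDMR_ALIGN_UP().

```c
#include <stdint.h>

/* 1 GiB, the TDMR alignment and size granularity stated above. */
#define SKETCH_TDMR_ALIGNMENT	(UINT64_C(1) << 30)

/* Round @addr down to the nearest 1 GiB boundary. */
static inline uint64_t sketch_align_down(uint64_t addr)
{
	return addr & ~(SKETCH_TDMR_ALIGNMENT - 1);
}

/* Round @addr up to the nearest 1 GiB boundary. */
static inline uint64_t sketch_align_up(uint64_t addr)
{
	return (addr + SKETCH_TDMR_ALIGNMENT - 1) &
	       ~(SKETCH_TDMR_ALIGNMENT - 1);
}
```

For example, two regions [1M, 512M) and [768M, 1G) both round to
[0, 1G), so a single TDMR with base 0 and size 1G covers both.
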
If a memory region spans a 1GB boundary and the former part is already
covered by the previous TDMR, use a new TDMR for the remaining part.

TDX only supports a limited number of TDMRs.  Disable TDX if all TDMRs
are consumed but there are more memory regions to cover.

Signed-off-by: Kai Huang
---

v7 -> v8: (Dave)
 - Added a sentence to the changelog stating this is the first patch to
   transition out the multi-step process of constructing TDMRs.
 - Added a comment to explain why "one TDMR for each memory region" is
   OK for now.
 - Trimmed down/removed unnecessary comments.
 - Removed tdmr_start() and use tdmr->base directly.
 - create_tdmrs() -> fill_out_tdmrs()
 - Other changes due to introducing 'struct tdmr_info_list'.

v6 -> v7:
 - No change.

v5 -> v6:
 - Rebase due to using 'tdx_memblock' instead of memblock.

v3 -> v5 (no feedback on v4):
 - Removed allocating TDMR individually.
 - Improved changelog by using Dave's words.
 - Made TDMR_START() and TDMR_END() static inline functions.

---
 arch/x86/virt/vmx/tdx/tdx.c | 95 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 2 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index d36ac72ef299..5b1de0200c6b 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -407,6 +407,90 @@ static void free_tdmr_list(struct tdmr_info_list *tdmr_list)
 			tdmr_list->max_tdmrs * tdmr_list->tdmr_sz);
 }
 
+/* Get the TDMR from the list at the given index. */
+static struct tdmr_info *tdmr_entry(struct tdmr_info_list *tdmr_list,
+				    int idx)
+{
+	return (struct tdmr_info *)((unsigned long)tdmr_list->first_tdmr +
+			tdmr_list->tdmr_sz * idx);
+}
+
+#define TDMR_ALIGNMENT		BIT_ULL(30)
+#define TDMR_PFN_ALIGNMENT	(TDMR_ALIGNMENT >> PAGE_SHIFT)
+#define TDMR_ALIGN_DOWN(_addr)	ALIGN_DOWN((_addr), TDMR_ALIGNMENT)
+#define TDMR_ALIGN_UP(_addr)	ALIGN((_addr), TDMR_ALIGNMENT)
+
+static inline u64 tdmr_end(struct tdmr_info *tdmr)
+{
+	return tdmr->base + tdmr->size;
+}
+
+/*
+ * Take the memory referenced in @tmb_list and populate the
+ * preallocated @tdmr_list, following all the special alignment
+ * and size rules for TDMR.
+ */
+static int fill_out_tdmrs(struct list_head *tmb_list,
+			  struct tdmr_info_list *tdmr_list)
+{
+	struct tdx_memblock *tmb;
+	int tdmr_idx = 0;
+
+	/*
+	 * Loop over TDX memory regions and fill out TDMRs to cover them.
+	 * To keep it simple, always try to use one TDMR to cover one
+	 * memory region.
+	 *
+	 * In practice TDX1.0 supports 64 TDMRs, which is big enough to
+	 * cover all memory regions in reality if the admin doesn't use
+	 * 'memmap' to create a bunch of discrete memory regions.  When
+	 * there's a real problem, enhancement can be done to merge TDMRs
+	 * to reduce the final number of TDMRs.
+	 */
+	list_for_each_entry(tmb, tmb_list, list) {
+		struct tdmr_info *tdmr = tdmr_entry(tdmr_list, tdmr_idx);
+		u64 start, end;
+
+		start = TDMR_ALIGN_DOWN(PFN_PHYS(tmb->start_pfn));
+		end = TDMR_ALIGN_UP(PFN_PHYS(tmb->end_pfn));
+
+		/*
+		 * A valid size indicates the current TDMR has already
+		 * been filled out to cover the previous memory region(s).
+		 */
+		if (tdmr->size) {
+			/*
+			 * Loop to the next if the current memory region
+			 * has already been fully covered.
+			 */
+			if (end <= tdmr_end(tdmr))
+				continue;
+
+			/* Otherwise, skip the already covered part. */
+			if (start < tdmr_end(tdmr))
+				start = tdmr_end(tdmr);
+
+			/*
+			 * Create a new TDMR to cover the current memory
+			 * region, or the remaining part of it.
+			 */
+			tdmr_idx++;
+			if (tdmr_idx >= tdmr_list->max_tdmrs)
+				return -E2BIG;
+
+			tdmr = tdmr_entry(tdmr_list, tdmr_idx);
+		}
+
+		tdmr->base = start;
+		tdmr->size = end - start;
+	}
+
+	/* @tdmr_idx is always the index of last valid TDMR. */
+	tdmr_list->nr_tdmrs = tdmr_idx + 1;
+
+	return 0;
+}
+
 /*
  * Construct a list of TDMRs on the preallocated space in @tdmr_list
  * to cover all TDX memory regions in @tmb_list based on the TDX module
@@ -416,16 +500,23 @@ static int construct_tdmrs(struct list_head *tmb_list,
 				 struct tdmr_info_list *tdmr_list,
 				 struct tdsysinfo_struct *sysinfo)
 {
+	int ret;
+
+	ret = fill_out_tdmrs(tmb_list, tdmr_list);
+	if (ret)
+		goto err;
+
 	/*
 	 * TODO:
 	 *
-	 *  - Fill out TDMRs to cover all TDX memory regions.
 	 *  - Allocate and set up PAMTs for each TDMR.
 	 *  - Designate reserved areas for each TDMR.
 	 *
 	 * Return -EINVAL until constructing TDMRs is done
 	 */
-	return -EINVAL;
+	ret = -EINVAL;
+err:
+	return ret;
 }
 
 static int init_tdx_module(void)
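
To make the coverage logic in fill_out_tdmrs() easier to follow outside
the kernel, here is a minimal userspace sketch of the same loop.  It is
an illustration only, under stated assumptions: the sketch_* names and
the plain-array interface are hypothetical, regions are given as byte
ranges sorted by address rather than as a tdx_memblock list, and the
TDMR array is assumed to be zero-initialized with room for at least one
entry.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_1G	(UINT64_C(1) << 30)

/* Hypothetical stand-ins for tdx_memblock and tdmr_info. */
struct sketch_region { uint64_t start, end; };	/* [start, end) in bytes */
struct sketch_tdmr   { uint64_t base, size; };

/*
 * Model of the loop in fill_out_tdmrs().  @tdmrs must be zeroed and
 * @regions sorted by address.  Returns the number of TDMRs used, or
 * -E2BIG if @max_tdmrs entries are not enough.  As in the patch, an
 * empty region list still reports one (empty) TDMR.
 */
static int sketch_fill_out_tdmrs(const struct sketch_region *regions,
				 size_t nr_regions,
				 struct sketch_tdmr *tdmrs, int max_tdmrs)
{
	int idx = 0;
	size_t i;

	for (i = 0; i < nr_regions; i++) {
		struct sketch_tdmr *tdmr = &tdmrs[idx];
		/* Widen the region to 1G boundaries on both sides. */
		uint64_t start = regions[i].start & ~(SKETCH_1G - 1);
		uint64_t end = (regions[i].end + SKETCH_1G - 1) &
			       ~(SKETCH_1G - 1);

		/* Non-zero size: this TDMR already covers earlier regions. */
		if (tdmr->size) {
			uint64_t cur_end = tdmr->base + tdmr->size;

			/* Region is fully covered, nothing to add. */
			if (end <= cur_end)
				continue;

			/* Skip the part that is already covered. */
			if (start < cur_end)
				start = cur_end;

			/* Use a new TDMR for the (remaining) region. */
			if (++idx >= max_tdmrs)
				return -E2BIG;
			tdmr = &tdmrs[idx];
		}

		tdmr->base = start;
		tdmr->size = end - start;
	}

	return idx + 1;
}
```

With regions [1M, 512M), [768M, 1.5G) and [4G, 5G), the sketch produces
three TDMRs: [0, 1G), [1G, 2G) and [4G, 5G).  The second region's first
part is already covered by the first TDMR, so only its remainder gets a
new one.  With room for only two TDMRs the same input fails with
-E2BIG.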