Message ID | 27dcd2781a450b3f77a2aec833de6a3669bc0fb8.1670566861.git.kai.huang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | TDX host kernel support | expand |
On 12/8/22 22:52, Kai Huang wrote: > +static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, > + u64 size, u16 max_reserved_per_tdmr) > +{ > + struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; > + int idx = *p_idx; > + > + /* Reserved area must be 4K aligned in offset and size */ > + if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) > + return -EINVAL; > + > + if (idx >= max_reserved_per_tdmr) > + return -E2BIG; > + > + rsvd_areas[idx].offset = addr - tdmr->base; > + rsvd_areas[idx].size = size; > + > + *p_idx = idx + 1; > + > + return 0; > +} It's probably worth at least a comment here to say: /* * Consume one reserved area per call. Make no effort to * optimize or reduce the number of reserved areas which are * consumed by contiguous reserved areas, for instance. */ I think the -E2BIG is also wrong. It should be ENOSPC. I'd also add a pr_warn() there. Especially with how lazy this whole thing is, I can start to see how the reserved areas might be exhausted. Let's be kind to our future selves and make the error (and the fix) easier to find. It's probably also worth noting *somewhere* that there's a balance to be had between TDMRs and reserved areas. A system that is running out of reserved areas in a TDMR could split a TDMR to get more reserved areas. A system that has run out of TDMRs could relatively easily coalesce two adjacent TDMRs (before the PAMTs are allocated) and use a reserved area if there was a gap between them. I'm *really* close to acking this patch once those are fixed up.
On Fri, 2023-01-06 at 14:07 -0800, Dave Hansen wrote: > On 12/8/22 22:52, Kai Huang wrote: > > +static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, > > + u64 size, u16 max_reserved_per_tdmr) > > +{ > > + struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; > > + int idx = *p_idx; > > + > > + /* Reserved area must be 4K aligned in offset and size */ > > + if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) > > + return -EINVAL; > > + > > + if (idx >= max_reserved_per_tdmr) > > + return -E2BIG; > > + > > + rsvd_areas[idx].offset = addr - tdmr->base; > > + rsvd_areas[idx].size = size; > > + > > + *p_idx = idx + 1; > > + > > + return 0; > > +} > > It's probably worth at least a comment here to say: > > /* > * Consume one reserved area per call. Make no effort to > * optimize or reduce the number of reserved areas which are > * consumed by contiguous reserved areas, for instance. > */ I'll add this comment before the code to set up rsvd_areas[idx]. > > I think the -E2BIG is also wrong. It should be ENOSPC. I'd also add a > pr_warn() there. Especially with how lazy this whole thing is, I can > start to see how the reserved areas might be exhausted. Let's be kind > to our future selves and make the error (and the fix) easier to find. Yes agreed. Will change to -ENOSPC and add pr_warn(). > It's probably also worth noting *somewhere* that there's a balance to be > had between TDMRs and reserved areas. A system that is running out of > reserved areas in a TDMR could split a TDMR to get more reserved areas. > A system that has run out of TDMRs could relatively easily coalesce two > adjacent TDMRs (before the PAMTs are allocated) and use a reserved area > if there was a gap between them. We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better since that patch is the first place where the balance of TDMRs and reserved areas is related. What is your suggestion? > > I'm *really* close to acking this patch once those are fixed up.
On 1/9/23 17:19, Huang, Kai wrote: >> It's probably also worth noting *somewhere* that there's a balance to be >> had between TDMRs and reserved areas. A system that is running out of >> reserved areas in a TDMR could split a TDMR to get more reserved areas. >> A system that has run out of TDMRs could relatively easily coalesce two >> adjacent TDMRs (before the PAMTs are allocated) and use a reserved area >> if there was a gap between them. > We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: > Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better > since that patch is the first place where the balance of TDMRs and reserved > areas is related. > > What is your suggestion? Just put it close to the code that actually hits the problem so the potential solution is close at hand to whoever hits the problem.
On Tue, 2023-01-10 at 01:19 +0000, Huang, Kai wrote: > > I think the -E2BIG is also wrong. It should be ENOSPC. I'd also add a > > pr_warn() there. Especially with how lazy this whole thing is, I can > > start to see how the reserved areas might be exhausted. Let's be kind > > to our future selves and make the error (and the fix) easier to find. > > Yes agreed. Will change to -ENOSPC and add pr_warn(). Btw, in patch ("x86/virt/tdx: Fill out TDMRs to cover all TDX memory regions"), when there are too many TDMRs, I suppose I should also return -ENOSPC instead of -E2BIG?
On Mon, 2023-01-09 at 17:22 -0800, Dave Hansen wrote: > On 1/9/23 17:19, Huang, Kai wrote: > > > It's probably also worth noting *somewhere* that there's a balance to be > > > had between TDMRs and reserved areas. A system that is running out of > > > reserved areas in a TDMR could split a TDMR to get more reserved areas. > > > A system that has run out of TDMRs could relatively easily coalesce two > > > adjacent TDMRs (before the PAMTs are allocated) and use a reserved area > > > if there was a gap between them. > > We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: > > Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better > > since that patch is the first place where the balance of TDMRs and reserved > > areas is related. > > > > What is your suggestion? > > Just put it close to the code that actually hits the problem so the > potential solution is close at hand to whoever hits the problem. > Sorry to double check: the code which hits the problem is the 'if (idx >= max_reserved_per_tdmr)' check in tdmr_add_rsvd_area(), so I think I can add right before this check?
On 1/10/23 03:01, Huang, Kai wrote: > On Tue, 2023-01-10 at 01:19 +0000, Huang, Kai wrote: >>> I think the -E2BIG is also wrong. It should be ENOSPC. I'd also add a >>> pr_warn() there. Especially with how lazy this whole thing is, I can >>> start to see how the reserved areas might be exhausted. Let's be kind >>> to our future selves and make the error (and the fix) easier to find. >> Yes agreed. Will change to -ENOSPC and add pr_warn(). > Btw, in patch ("x86/virt/tdx: Fill out TDMRs to cover all TDX memory regions"), > when there are too many TDMRs, I suppose I should also return -ENOSPC instead of > -E2BIG? Yes.
On 1/10/23 03:01, Huang, Kai wrote: > On Mon, 2023-01-09 at 17:22 -0800, Dave Hansen wrote: >> On 1/9/23 17:19, Huang, Kai wrote: >>>> It's probably also worth noting *somewhere* that there's a balance to be >>>> had between TDMRs and reserved areas. A system that is running out of >>>> reserved areas in a TDMR could split a TDMR to get more reserved areas. >>>> A system that has run out of TDMRs could relatively easily coalesce two >>>> adjacent TDMRs (before the PAMTs are allocated) and use a reserved area >>>> if there was a gap between them. >>> We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: >>> Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better >>> since that patch is the first place where the balance of TDMRs and reserved >>> areas is related. >>> >>> What is your suggestion? >> Just put it close to the code that actually hits the problem so the >> potential solution is close at hand to whoever hits the problem. >> > Sorry to double check: the code which hits the problem is the 'if (idx >= > max_reserved_per_tdmr)' check in tdmr_add_rsvd_area(), so I think I can add > right before this check? Please just hack together how you think it should look and either reply with an updated patch, or paste the relevant code snippet in your reply. That'll keep me from having to go chase this code back down.
On Tue, 2023-01-10 at 07:19 -0800, Dave Hansen wrote: > On 1/10/23 03:01, Huang, Kai wrote: > > On Mon, 2023-01-09 at 17:22 -0800, Dave Hansen wrote: > > > On 1/9/23 17:19, Huang, Kai wrote: > > > > > It's probably also worth noting *somewhere* that there's a balance to be > > > > > had between TDMRs and reserved areas. A system that is running out of > > > > > reserved areas in a TDMR could split a TDMR to get more reserved areas. > > > > > A system that has run out of TDMRs could relatively easily coalesce two > > > > > adjacent TDMRs (before the PAMTs are allocated) and use a reserved area > > > > > if there was a gap between them. > > > > We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: > > > > Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better > > > > since that patch is the first place where the balance of TDMRs and reserved > > > > areas is related. > > > > > > > > What is your suggestion? > > > Just put it close to the code that actually hits the problem so the > > > potential solution is close at hand to whoever hits the problem. > > > > > Sorry to double check: the code which hits the problem is the 'if (idx >= > > max_reserved_per_tdmr)' check in tdmr_add_rsvd_area(), so I think I can add > > right before this check? > > Please just hack together how you think it should look and either reply > with an updated patch, or paste the relevant code snippet in your reply. > That'll keep me from having to go chase this code back down. > Thanks for the tip. How about below? static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, u64 size, u16 max_reserved_per_tdmr) { struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; int idx = *p_idx; /* Reserved area must be 4K aligned in offset and size */ if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) return -EINVAL; /* * The TDX module supports only limited number of TDMRs and * limited number of reserved areas for each TDMR. There's a * balance to be had between TDMRs and reserved areas. A system * that is running out of reserved areas in a TDMR could split a * TDMR to get more reserved areas. A system that has run out * of TDMRs could relatively easily coalesce two adjacent TDMRs * (before the PAMTs are allocated) and use a reserved area if * there was a gap between them. */ if (idx >= max_reserved_per_tdmr) { pr_warn("too many reserved areas for TDMR [0x%llx, 0x%llx)\n", tdmr->base, tdmr_end(tdmr)); return -ENOSPC; } /* * Consume one reserved area per call. Make no effort to * optimize or reduce the number of reserved areas which are * consumed by contiguous reserved areas, for instance. */ rsvd_areas[idx].offset = addr - tdmr->base; rsvd_areas[idx].size = size; *p_idx = idx + 1; return 0; }
On 1/11/23 02:57, Huang, Kai wrote: > On Tue, 2023-01-10 at 07:19 -0800, Dave Hansen wrote: >> On 1/10/23 03:01, Huang, Kai wrote: >>> On Mon, 2023-01-09 at 17:22 -0800, Dave Hansen wrote: >>>> On 1/9/23 17:19, Huang, Kai wrote: >>>>>> It's probably also worth noting *somewhere* that there's a balance to be >>>>>> had between TDMRs and reserved areas. A system that is running out of >>>>>> reserved areas in a TDMR could split a TDMR to get more reserved areas. >>>>>> A system that has run out of TDMRs could relatively easily coalesce two >>>>>> adjacent TDMRs (before the PAMTs are allocated) and use a reserved area >>>>>> if there was a gap between them. >>>>> We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: >>>>> Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better >>>>> since that patch is the first place where the balance of TDMRs and reserved >>>>> areas is related. >>>>> >>>>> What is your suggestion? >>>> Just put it close to the code that actually hits the problem so the >>>> potential solution is close at hand to whoever hits the problem. >>>> >>> Sorry to double check: the code which hits the problem is the 'if (idx >= >>> max_reserved_per_tdmr)' check in tdmr_add_rsvd_area(), so I think I can add >>> right before this check? >> >> Please just hack together how you think it should look and either reply >> with an updated patch, or paste the relevant code snippet in your reply. >> That'll keep me from having to go chase this code back down. >> > > Thanks for the tip. How about below? > > static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, > u64 size, u16 max_reserved_per_tdmr) > { > struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; > int idx = *p_idx; > > /* Reserved area must be 4K aligned in offset and size */ > if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) > return -EINVAL; > > /* > * The TDX module supports only limited number of TDMRs and > * limited number of reserved areas for each TDMR. There's a > * balance to be had between TDMRs a2nd reserved areas. A system > * that is running out of reserved areas in a TDMR could split a > * TDMR to get more reserved areas. A system that has run out > * of TDMRs could relatively easily coalesce two adjacent TDMRs > * (before the PAMTs are allocated) and use a reserved area if > * there was a gap between them. > */ > if (idx >= max_reserved_per_tdmr) { > pr_warn("too many reserved areas for TDMR [0x%llx, 0x%llx)\n", > tdmr->base, tdmr_end(tdmr)); > return -ENOSPC; > } This isn't really converging on a solution. At this point, I just see my verbatim text being copied and pasted into these functions without really anything additional. This comment, for instance, just blathers about what could be done but doesn't actually explain what it is doing here. But, again, this isn't converging. It's just thrashing and not getting any better. I guess I'll just fix it up best I can when I apply it.
On Wed, 2023-01-11 at 08:16 -0800, Dave Hansen wrote: > On 1/11/23 02:57, Huang, Kai wrote: > > On Tue, 2023-01-10 at 07:19 -0800, Dave Hansen wrote: > > > On 1/10/23 03:01, Huang, Kai wrote: > > > > On Mon, 2023-01-09 at 17:22 -0800, Dave Hansen wrote: > > > > > On 1/9/23 17:19, Huang, Kai wrote: > > > > > > > It's probably also worth noting *somewhere* that there's a balance to be > > > > > > > had between TDMRs and reserved areas. A system that is running out of > > > > > > > reserved areas in a TDMR could split a TDMR to get more reserved areas. > > > > > > > A system that has run out of TDMRs could relatively easily coalesce two > > > > > > > adjacent TDMRs (before the PAMTs are allocated) and use a reserved area > > > > > > > if there was a gap between them. > > > > > > We can add above to the changelog of this patch, or the patch 09 ("x86/virt/tdx: > > > > > > Fill out TDMRs to cover all TDX memory regions"). The latter perhaps is better > > > > > > since that patch is the first place where the balance of TDMRs and reserved > > > > > > areas is related. > > > > > > > > > > > > What is your suggestion? > > > > > Just put it close to the code that actually hits the problem so the > > > > > potential solution is close at hand to whoever hits the problem. > > > > > > > > > Sorry to double check: the code which hits the problem is the 'if (idx >= > > > > max_reserved_per_tdmr)' check in tdmr_add_rsvd_area(), so I think I can add > > > > right before this check? > > > > > > Please just hack together how you think it should look and either reply > > > with an updated patch, or paste the relevant code snippet in your reply. > > > That'll keep me from having to go chase this code back down. > > > > > > > Thanks for the tip. How about below? > > > > static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, > > u64 size, u16 max_reserved_per_tdmr) > > { > > struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; > > int idx = *p_idx; > > > > /* Reserved area must be 4K aligned in offset and size */ > > if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) > > return -EINVAL; > > > > /* > > * The TDX module supports only limited number of TDMRs and > > * limited number of reserved areas for each TDMR. There's a > > * balance to be had between TDMRs a2nd reserved areas. A system > > * that is running out of reserved areas in a TDMR could split a > > * TDMR to get more reserved areas. A system that has run out > > * of TDMRs could relatively easily coalesce two adjacent TDMRs > > * (before the PAMTs are allocated) and use a reserved area if > > * there was a gap between them. > > */ > > if (idx >= max_reserved_per_tdmr) { > > pr_warn("too many reserved areas for TDMR [0x%llx, 0x%llx)\n", > > tdmr->base, tdmr_end(tdmr)); > > return -ENOSPC; > > } > > This isn't really converging on a solution. At this point, I just see > my verbatim text being copied and pasted into these functions without > really anything additional. > > This comment, for instance, just blathers about what could be done but > doesn't actually explain what it is doing here. > > But, again, this isn't converging. It's just thrashing and not getting > any better. I guess I'll just fix it up best I can when I apply it. Appreciate your help!
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index cf970a783f1f..620b35e2a61b 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -21,6 +21,7 @@ #include <linux/sizes.h> #include <linux/pfn.h> #include <linux/align.h> +#include <linux/sort.h> #include <asm/pgtable_types.h> #include <asm/msr.h> #include <asm/tdx.h> @@ -687,6 +688,202 @@ static unsigned long tdmrs_count_pamt_pages(struct tdmr_info_list *tdmr_list) return pamt_npages; } +static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr, + u64 size, u16 max_reserved_per_tdmr) +{ + struct tdmr_reserved_area *rsvd_areas = tdmr->reserved_areas; + int idx = *p_idx; + + /* Reserved area must be 4K aligned in offset and size */ + if (WARN_ON(addr & ~PAGE_MASK || size & ~PAGE_MASK)) + return -EINVAL; + + if (idx >= max_reserved_per_tdmr) + return -E2BIG; + + rsvd_areas[idx].offset = addr - tdmr->base; + rsvd_areas[idx].size = size; + + *p_idx = idx + 1; + + return 0; +} + +/* + * Go through @tmb_list to find holes between memory areas. If any of + * those holes fall within @tdmr, set up a TDMR reserved area to cover + * the hole. + */ +static int tdmr_populate_rsvd_holes(struct list_head *tmb_list, + struct tdmr_info *tdmr, + int *rsvd_idx, + u16 max_reserved_per_tdmr) +{ + struct tdx_memblock *tmb; + u64 prev_end; + int ret; + + /* + * Start looking for reserved blocks at the + * beginning of the TDMR. + */ + prev_end = tdmr->base; + list_for_each_entry(tmb, tmb_list, list) { + u64 start, end; + + start = PFN_PHYS(tmb->start_pfn); + end = PFN_PHYS(tmb->end_pfn); + + /* Break if this region is after the TDMR */ + if (start >= tdmr_end(tdmr)) + break; + + /* Exclude regions before this TDMR */ + if (end < tdmr->base) + continue; + + /* + * Skip over memory areas that + * have already been dealt with. + */ + if (start <= prev_end) { + prev_end = end; + continue; + } + + /* Add the hole before this region */ + ret = tdmr_add_rsvd_area(tdmr, rsvd_idx, prev_end, + start - prev_end, + max_reserved_per_tdmr); + if (ret) + return ret; + + prev_end = end; + } + + /* Add the hole after the last region if it exists. */ + if (prev_end < tdmr_end(tdmr)) { + ret = tdmr_add_rsvd_area(tdmr, rsvd_idx, prev_end, + tdmr_end(tdmr) - prev_end, + max_reserved_per_tdmr); + if (ret) + return ret; + } + + return 0; +} + +/* + * Go through @tdmr_list to find all PAMTs. If any of those PAMTs + * overlaps with @tdmr, set up a TDMR reserved area to cover the + * overlapping part. + */ +static int tdmr_populate_rsvd_pamts(struct tdmr_info_list *tdmr_list, + struct tdmr_info *tdmr, + int *rsvd_idx, + u16 max_reserved_per_tdmr) +{ + int i, ret; + + for (i = 0; i < tdmr_list->nr_tdmrs; i++) { + struct tdmr_info *tmp = tdmr_entry(tdmr_list, i); + unsigned long pamt_start_pfn, pamt_npages; + u64 pamt_start, pamt_end; + + tdmr_get_pamt(tmp, &pamt_start_pfn, &pamt_npages); + /* Each TDMR must already have PAMT allocated */ + WARN_ON_ONCE(!pamt_npages || !pamt_start_pfn); + + pamt_start = PFN_PHYS(pamt_start_pfn); + pamt_end = PFN_PHYS(pamt_start_pfn + pamt_npages); + + /* Skip PAMTs outside of the given TDMR */ + if ((pamt_end <= tdmr->base) || + (pamt_start >= tdmr_end(tdmr))) + continue; + + /* Only mark the part within the TDMR as reserved */ + if (pamt_start < tdmr->base) + pamt_start = tdmr->base; + if (pamt_end > tdmr_end(tdmr)) + pamt_end = tdmr_end(tdmr); + + ret = tdmr_add_rsvd_area(tdmr, rsvd_idx, pamt_start, + pamt_end - pamt_start, + max_reserved_per_tdmr); + if (ret) + return ret; + } + + return 0; +} + +/* Compare function called by sort() for TDMR reserved areas */ +static int rsvd_area_cmp_func(const void *a, const void *b) +{ + struct tdmr_reserved_area *r1 = (struct tdmr_reserved_area *)a; + struct tdmr_reserved_area *r2 = (struct tdmr_reserved_area *)b; + + if (r1->offset + r1->size <= r2->offset) + return -1; + if (r1->offset >= r2->offset + r2->size) + return 1; + + /* Reserved areas cannot overlap. The caller must guarantee. */ + WARN_ON_ONCE(1); + return -1; +} + +/* + * Populate reserved areas for the given @tdmr, including memory holes + * (via @tmb_list) and PAMTs (via @tdmr_list). + */ +static int tdmr_populate_rsvd_areas(struct tdmr_info *tdmr, + struct list_head *tmb_list, + struct tdmr_info_list *tdmr_list, + u16 max_reserved_per_tdmr) +{ + int ret, rsvd_idx = 0; + + ret = tdmr_populate_rsvd_holes(tmb_list, tdmr, &rsvd_idx, + max_reserved_per_tdmr); + if (ret) + return ret; + + ret = tdmr_populate_rsvd_pamts(tdmr_list, tdmr, &rsvd_idx, + max_reserved_per_tdmr); + if (ret) + return ret; + + /* TDX requires reserved areas listed in address ascending order */ + sort(tdmr->reserved_areas, rsvd_idx, sizeof(struct tdmr_reserved_area), + rsvd_area_cmp_func, NULL); + + return 0; +} + +/* + * Populate reserved areas for all TDMRs in @tdmr_list, including memory + * holes (via @tmb_list) and PAMTs. + */ +static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list, + struct list_head *tmb_list, + u16 max_reserved_per_tdmr) +{ + int i; + + for (i = 0; i < tdmr_list->nr_tdmrs; i++) { + int ret; + + ret = tdmr_populate_rsvd_areas(tdmr_entry(tdmr_list, i), + tmb_list, tdmr_list, max_reserved_per_tdmr); + if (ret) + return ret; + } + + return 0; +} + /* * Construct a list of TDMRs on the preallocated space in @tdmr_list * to cover all TDX memory regions in @tmb_list based on the TDX module @@ -706,14 +903,14 @@ static int construct_tdmrs(struct list_head *tmb_list, sysinfo->pamt_entry_size); if (ret) goto err; - /* - * TODO: - * - * - Designate reserved areas for each TDMR. - * - * Return -EINVAL until constructing TDMRs is done - */ - ret = -EINVAL; + + ret = tdmrs_populate_rsvd_areas_all(tdmr_list, tmb_list, + sysinfo->max_reserved_per_tdmr); + if (ret) + goto err_free_pamts; + + return 0; +err_free_pamts: tdmrs_free_pamt_all(tdmr_list); err: return ret;