| Message ID | e0e2d865-5ac9-d7ac-c763-f4b99b699224@suse.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | IOMMU: superpage support when not sharing pagetables |
On Mon, Apr 25, 2022 at 10:43:16AM +0200, Jan Beulich wrote:
> When a page table ends up with all contiguous entries (including all
> identical attributes), it can be replaced by a superpage entry at the
> next higher level. The page table itself can then be scheduled for
> freeing.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Unlike the freeing of all-empty page tables, this causes quite a bit of
> back and forth for PV domains, due to their mapping/unmapping of pages
> when they get converted to/from being page tables. It may therefore be
> worth considering to delay re-coalescing a little, to avoid doing so
> when the superpage would otherwise get split again pretty soon. But I
> think this would better be the subject of a separate change anyway.
>
> Of course this could also be helped by more "aware" kernel side
> behavior: They could avoid immediately mapping freed page tables
> writable again, in anticipation of re-using that same page for another
> page table elsewhere.
> ---
> v4: Re-base over changes earlier in the series.
> v3: New.
>
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -81,7 +81,8 @@ static union amd_iommu_pte set_iommu_pte
>                                             unsigned long dfn,
>                                             unsigned long next_mfn,
>                                             unsigned int level,
> -                                           bool iw, bool ir)
> +                                           bool iw, bool ir,
> +                                           bool *contig)
>  {
>      union amd_iommu_pte *table, *pde, old;
>
> @@ -94,11 +95,15 @@ static union amd_iommu_pte set_iommu_pte
>           old.iw != iw || old.ir != ir )
>      {
>          set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> -        pt_update_contig_markers(&table->raw, pfn_to_pde_idx(dfn, level),
> -                                 level, PTE_kind_leaf);
> +        *contig = pt_update_contig_markers(&table->raw,
> +                                           pfn_to_pde_idx(dfn, level),
> +                                           level, PTE_kind_leaf);
>      }
>      else
> +    {
>          old.pr = false; /* signal "no change" to the caller */
> +        *contig = false;

So we assume that any caller getting contig == true must have acted
and coalesced the page table?

Might be worth a comment, to note that the function assumes that a
previous return of contig == true will have coalesced the page table
and hence a "no change" PTE write is not expected to happen on a
contig page table.

Thanks, Roger.
On 10.05.2022 17:31, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:43:16AM +0200, Jan Beulich wrote:
>> @@ -94,11 +95,15 @@ static union amd_iommu_pte set_iommu_pte
>>           old.iw != iw || old.ir != ir )
>>      {
>>          set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
>> -        pt_update_contig_markers(&table->raw, pfn_to_pde_idx(dfn, level),
>> -                                 level, PTE_kind_leaf);
>> +        *contig = pt_update_contig_markers(&table->raw,
>> +                                           pfn_to_pde_idx(dfn, level),
>> +                                           level, PTE_kind_leaf);
>>      }
>>      else
>> +    {
>>          old.pr = false; /* signal "no change" to the caller */
>> +        *contig = false;
>
> So we assume that any caller getting contig == true must have acted
> and coalesced the page table?

Yes, except that I wouldn't use "must", but "would". It's not a
requirement after all; functionality-wise all will be fine without
re-coalescing.

> Might be worth a comment, to note that the function assumes that a
> previous return of contig == true will have coalesced the page table
> and hence a "no change" PTE write is not expected to happen on a
> contig page table.

I'm not convinced, as there's effectively only one caller,
amd_iommu_map_page(). I also don't see why "no change" would be a
problem. "No change" can't result in a fully contiguous page table
if the page table wasn't fully contiguous already before (at which
point it would have been replaced by a superpage).

Jan
On Wed, May 18, 2022 at 12:40:59PM +0200, Jan Beulich wrote:
> On 10.05.2022 17:31, Roger Pau Monné wrote:
> > On Mon, Apr 25, 2022 at 10:43:16AM +0200, Jan Beulich wrote:
> >> @@ -94,11 +95,15 @@ static union amd_iommu_pte set_iommu_pte
> >>           old.iw != iw || old.ir != ir )
> >>      {
> >>          set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> >> -        pt_update_contig_markers(&table->raw, pfn_to_pde_idx(dfn, level),
> >> -                                 level, PTE_kind_leaf);
> >> +        *contig = pt_update_contig_markers(&table->raw,
> >> +                                           pfn_to_pde_idx(dfn, level),
> >> +                                           level, PTE_kind_leaf);
> >>      }
> >>      else
> >> +    {
> >>          old.pr = false; /* signal "no change" to the caller */
> >> +        *contig = false;
> >
> > So we assume that any caller getting contig == true must have acted
> > and coalesced the page table?
>
> Yes, except that I wouldn't use "must", but "would". It's not a
> requirement after all; functionality-wise all will be fine without
> re-coalescing.
>
> > Might be worth a comment, to note that the function assumes that a
> > previous return of contig == true will have coalesced the page table
> > and hence a "no change" PTE write is not expected to happen on a
> > contig page table.
>
> I'm not convinced, as there's effectively only one caller,
> amd_iommu_map_page(). I also don't see why "no change" would be a
> problem. "No change" can't result in a fully contiguous page table
> if the page table wasn't fully contiguous already before (at which
> point it would have been replaced by a superpage).

Right, I agree, it's just that I would have preferred the result from
set_iommu_ptes_present() to be consistent, ie: repeated calls to it
using the same PTE should set contig to the same value.

Anyway, that's not relevant to any current callers, so:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -81,7 +81,8 @@ static union amd_iommu_pte set_iommu_pte
                                            unsigned long dfn,
                                            unsigned long next_mfn,
                                            unsigned int level,
-                                           bool iw, bool ir)
+                                           bool iw, bool ir,
+                                           bool *contig)
 {
     union amd_iommu_pte *table, *pde, old;

@@ -94,11 +95,15 @@ static union amd_iommu_pte set_iommu_pte
          old.iw != iw || old.ir != ir )
     {
         set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
-        pt_update_contig_markers(&table->raw, pfn_to_pde_idx(dfn, level),
-                                 level, PTE_kind_leaf);
+        *contig = pt_update_contig_markers(&table->raw,
+                                           pfn_to_pde_idx(dfn, level),
+                                           level, PTE_kind_leaf);
     }
     else
+    {
         old.pr = false; /* signal "no change" to the caller */
+        *contig = false;
+    }

     unmap_domain_page(table);

@@ -407,6 +412,7 @@ int cf_check amd_iommu_map_page(
 {
     struct domain_iommu *hd = dom_iommu(d);
     unsigned int level = (IOMMUF_order(flags) / PTE_PER_TABLE_SHIFT) + 1;
+    bool contig;
     int rc;
     unsigned long pt_mfn = 0;
     union amd_iommu_pte old;

@@ -447,8 +453,26 @@ int cf_check amd_iommu_map_page(

     /* Install mapping */
     old = set_iommu_pte_present(pt_mfn, dfn_x(dfn), mfn_x(mfn), level,
-                                (flags & IOMMUF_writable),
-                                (flags & IOMMUF_readable));
+                                flags & IOMMUF_writable,
+                                flags & IOMMUF_readable, &contig);
+
+    while ( unlikely(contig) && ++level < hd->arch.amd.paging_mode )
+    {
+        struct page_info *pg = mfn_to_page(_mfn(pt_mfn));
+        unsigned long next_mfn;
+
+        if ( iommu_pde_from_dfn(d, dfn_x(dfn), level, &pt_mfn, flush_flags,
+                                false) )
+            BUG();
+        BUG_ON(!pt_mfn);
+
+        next_mfn = mfn_x(mfn) & (~0UL << (PTE_PER_TABLE_SHIFT * (level - 1)));
+        set_iommu_pte_present(pt_mfn, dfn_x(dfn), next_mfn, level,
+                              flags & IOMMUF_writable,
+                              flags & IOMMUF_readable, &contig);
+        *flush_flags |= IOMMU_FLUSHF_modified | IOMMU_FLUSHF_all;
+        iommu_queue_free_pgtable(hd, pg);
+    }

     spin_unlock(&hd->arch.mapping_lock);
When a page table ends up with all contiguous entries (including all
identical attributes), it can be replaced by a superpage entry at the
next higher level. The page table itself can then be scheduled for
freeing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Unlike the freeing of all-empty page tables, this causes quite a bit of
back and forth for PV domains, due to their mapping/unmapping of pages
when they get converted to/from being page tables. It may therefore be
worth considering to delay re-coalescing a little, to avoid doing so
when the superpage would otherwise get split again pretty soon. But I
think this would better be the subject of a separate change anyway.

Of course this could also be helped by more "aware" kernel side
behavior: They could avoid immediately mapping freed page tables
writable again, in anticipation of re-using that same page for another
page table elsewhere.
---
v4: Re-base over changes earlier in the series.
v3: New.