Message ID | 3375af1e708b4ec3205f493a17da6e0369249096.1575891620.git.hongyxia@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Refactor super page shattering | expand |
On 09.12.2019 12:48, Hongyan Xia wrote: > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -5151,6 +5151,51 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v) > flush_area_local((const void *)v, f) : \ > flush_area_all((const void *)v, f)) > > +/* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. */ > +static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking) > +{ > + unsigned int i; > + l3_pgentry_t ol3e; > + l2_pgentry_t ol2e, *l2t = alloc_xen_pagetable(); > + > + if ( l2t == NULL ) Nowadays we seem to prefer !l2t in cases like this one. > + return -1; -ENOMEM please (and then handed on by the caller). > + ol3e = *pl3e; This could be the variable's initializer. > + ol2e = l2e_from_intpte(l3e_get_intpte(ol3e)); There's nothing "old" about this L2 entry, so its name would better be just "l2e" I think. > + for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) > + { > + l2e_write(l2t + i, ol2e); > + ol2e = l2e_from_intpte( > + l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER + PAGE_SHIFT))); Indentation looks odd here (also further down). If the first argument of a function call doesn't fit on the line and would also be ugly to split across lines, what we do is indent it the usual 4 characters from the function invocation, i.e. in this case ol2e = l2e_from_intpte( l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER + PAGE_SHIFT))); and then slightly shorter ol2e = l2e_from_intpte( l2e_get_intpte(ol2e) + (PAGE_SIZE << PAGETABLE_ORDER)); Of course, as mentioned before, I'm not overly happy to see type safety lost in case like this one, where it's not needed like e.g. further up to convert from L3 to L2 entry. > + } > + if ( locking ) > + spin_lock(&map_pgdir_lock); > + if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) && > + (l3e_get_flags(*pl3e) & _PAGE_PSE) ) > + { > + l3e_write_atomic(pl3e, > + l3e_from_paddr((paddr_t)virt_to_maddr(l2t), __PAGE_HYPERVISOR)); > + l2t = NULL; > + } > + if ( locking ) > + spin_unlock(&map_pgdir_lock); > + if ( virt ) > + { > + unsigned int flush_flags = > + FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER); > + > + if ( (l3e_get_flags(ol3e) & _PAGE_GLOBAL) ) Unnecessary pair of parentheses (which also wasn't there in the original code). > + flush_flags |= FLUSH_TLB_GLOBAL; Too deep indentation. > + flush_area(virt, flush_flags); > + } > + if ( l2t ) > + free_xen_pagetable(l2t); > + > + return 0; > +} Also please add blank lines between - L2 population and lock acquire, - lock release and TLB flush, - TLB flush and free. Jan
On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote: > > ol2e = l2e_from_intpte( > l2e_get_intpte(ol2e) + (PAGE_SIZE << > PAGETABLE_ORDER)); > > Of course, as mentioned before, I'm not overly happy to see type > safety lost in case like this one, where it's not needed like e.g. > further up to convert from L3 to L2 entry. > Okay, so I did a comparison between the efficiency of the assembly under a release build. The old "type-safe" way requires 16 instructions to prepare the first l2e, and each iteration of the inner loop of populating l2t requires 7 instructions. The new type-unsafe way requires 6 to prepare the first l2e, and each iteration of populating l2t takes 5 instructions. So the difference of populating l2t is 3600 vs. 2566 instructions, which is not very small. I have not tested the packed bit field way you suggested, but I think it could even be higher than 3600 due to masking, shifting and also overflow handling. Hongyan
On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote: > On 09.12.2019 12:48, Hongyan Xia wrote: > > --- a/xen/arch/x86/mm.c > > +++ b/xen/arch/x86/mm.c > > @@ -5151,6 +5151,51 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long > > v) > > flush_area_local((const void *)v, f) : \ > > flush_area_all((const void *)v, f)) > > > > +/* Shatter an l3 entry and populate l2. If virt is passed in, also > > do flush. */ > > +static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, > > bool locking) > > +{ > > + unsigned int i; > > + l3_pgentry_t ol3e; > > + l2_pgentry_t ol2e, *l2t = alloc_xen_pagetable(); > > + > > + if ( l2t == NULL ) > > Nowadays we seem to prefer !l2t in cases like this one. > > > + return -1; > > -ENOMEM please (and then handed on by the caller). > > > + ol3e = *pl3e; > > This could be the variable's initializer. > > > + ol2e = l2e_from_intpte(l3e_get_intpte(ol3e)); > > There's nothing "old" about this L2 entry, so its name would better > be just "l2e" I think. > > > + for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) > > + { > > + l2e_write(l2t + i, ol2e); > > + ol2e = l2e_from_intpte( > > + l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER + > > PAGE_SHIFT))); > > Indentation looks odd here (also further down). If the first argument > of a function call doesn't fit on the line and would also be ugly to > split across lines, what we do is indent it the usual 4 characters > from the function invocation, i.e. in this case > > ol2e = l2e_from_intpte( > l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER + > PAGE_SHIFT))); > > and then slightly shorter > > ol2e = l2e_from_intpte( > l2e_get_intpte(ol2e) + (PAGE_SIZE << > PAGETABLE_ORDER)); > > Of course, as mentioned before, I'm not overly happy to see type > safety lost in case like this one, where it's not needed like e.g. > further up to convert from L3 to L2 entry. > > > + } > > + if ( locking ) > > + spin_lock(&map_pgdir_lock); > > + if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) && > > + (l3e_get_flags(*pl3e) & _PAGE_PSE) ) > > + { > > + l3e_write_atomic(pl3e, > > + l3e_from_paddr((paddr_t)virt_to_maddr(l2t), > > __PAGE_HYPERVISOR)); > > + l2t = NULL; > > + } > > + if ( locking ) > > + spin_unlock(&map_pgdir_lock); > > + if ( virt ) > > + { > > + unsigned int flush_flags = > > + FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER); > > + > > + if ( (l3e_get_flags(ol3e) & _PAGE_GLOBAL) ) > > Unnecessary pair of parentheses (which also wasn't there in the > original code). > > > + flush_flags |= FLUSH_TLB_GLOBAL; > > Too deep indentation. > > > + flush_area(virt, flush_flags); > > + } > > + if ( l2t ) > > + free_xen_pagetable(l2t); > > + > > + return 0; > > +} > > Also please add blank lines between > - L2 population and lock acquire, > - lock release and TLB flush, > - TLB flush and free. > > Jan Issues fixed in v3. I have not touched the type safety part. If we think this is really important we can revert to what it was before, although from the quick study I did in my previous email, there is a performance difference. Hongyan
Hi Hongyan, On 11/12/2019 10:28, Xia, Hongyan wrote: > On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote: >> >> ol2e = l2e_from_intpte( >> l2e_get_intpte(ol2e) + (PAGE_SIZE << >> PAGETABLE_ORDER)); >> >> Of course, as mentioned before, I'm not overly happy to see type >> safety lost in case like this one, where it's not needed like e.g. >> further up to convert from L3 to L2 entry. >> > > Okay, so I did a comparison between the efficiency of the assembly > under a release build. > > The old "type-safe" way requires 16 instructions to prepare the first > l2e, and each iteration of the inner loop of populating l2t requires 7 > instructions. > > The new type-unsafe way requires 6 to prepare the first l2e, and each > iteration of populating l2t takes 5 instructions. > > So the difference of populating l2t is 3600 vs. 2566 instructions, > which is not very small. While this involves more instructions, how often do we expect the code to be called? Cheers,
On Wed, 2019-12-11 at 11:10 +0000, Julien Grall wrote: > Hi Hongyan, > ... > > While this involves more instructions, how often do we expect the > code > to be called? > > Cheers, > I don't expect this to be called very often in the current Xen. Although with direct map removal, a lot of the memory allocations (mostly xenheap allocations) will be mapped and unmapped on-demand and there is a much higher change of merging/shattering. However, the series moved all PTEs from xenheap to domheap, and we might see other things moved to domheap in the future, so we might not have many things left on xenheap anyway. Hongyan
Hi Hongyan, On 11/12/2019 11:28, Xia, Hongyan wrote: > On Wed, 2019-12-11 at 11:10 +0000, Julien Grall wrote: >> Hi Hongyan, >> ... >> >> While this involves more instructions, how often do we expect the >> code >> to be called? >> >> Cheers, >> > > I don't expect this to be called very often in the current Xen. > Although with direct map removal, a lot of the memory allocations > (mostly xenheap allocations) will be mapped and unmapped on-demand and > there is a much higher change of merging/shattering. Thank you for the explanation. In order to merge/shatter, you need the buddy allocator to ensure all the xenheap are allocated contiguously. I am pretty unconvinced there are a lot of page allocated via xenheap. So the chance to have contiguous xenheap allocation is very limited. But the merging/shattering can be counterproductive. An example short memory allocation (we have a few places like that): xmalloc(...); do something xfree(...); We would end up to merge and then a few ms later shatter again. So it feels to me, that merging is probably not worth it (I am planning to discuss with Andrew today about it). > > However, the series moved all PTEs from xenheap to domheap, and we > might see other things moved to domheap in the future, so we might not > have many things left on xenheap anyway. Typesafe is an important part of making our code base more secure than basic C (such as not mixing type). In this case, I think if we start to merge/shatter a lot, then we have a bigger problem and we may want to consider to remove it (see above) So it feels the optimization is not worth it. Note that I am not maintaining this code, so the final call is on Andrew and Jan. Cheers,
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 7d4dd80a85..6188a968ff 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -5151,6 +5151,51 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v) flush_area_local((const void *)v, f) : \ flush_area_all((const void *)v, f)) +/* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. */ +static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking) +{ + unsigned int i; + l3_pgentry_t ol3e; + l2_pgentry_t ol2e, *l2t = alloc_xen_pagetable(); + + if ( l2t == NULL ) + return -1; + + ol3e = *pl3e; + ol2e = l2e_from_intpte(l3e_get_intpte(ol3e)); + + for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) + { + l2e_write(l2t + i, ol2e); + ol2e = l2e_from_intpte( + l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER + PAGE_SHIFT))); + } + if ( locking ) + spin_lock(&map_pgdir_lock); + if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) && + (l3e_get_flags(*pl3e) & _PAGE_PSE) ) + { + l3e_write_atomic(pl3e, + l3e_from_paddr((paddr_t)virt_to_maddr(l2t), __PAGE_HYPERVISOR)); + l2t = NULL; + } + if ( locking ) + spin_unlock(&map_pgdir_lock); + if ( virt ) + { + unsigned int flush_flags = + FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER); + + if ( (l3e_get_flags(ol3e) & _PAGE_GLOBAL) ) + flush_flags |= FLUSH_TLB_GLOBAL; + flush_area(virt, flush_flags); + } + if ( l2t ) + free_xen_pagetable(l2t); + + return 0; +} + int map_pages_to_xen( unsigned long virt, mfn_t mfn, @@ -5244,9 +5289,6 @@ int map_pages_to_xen( if ( (l3e_get_flags(ol3e) & _PAGE_PRESENT) && (l3e_get_flags(ol3e) & _PAGE_PSE) ) { - unsigned int flush_flags = - FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER); - /* Skip this PTE if there is no change. */ if ( ((l3e_get_pfn(ol3e) & ~(L2_PAGETABLE_ENTRIES * L1_PAGETABLE_ENTRIES - 1)) + @@ -5267,33 +5309,9 @@ int map_pages_to_xen( continue; } - pl2e = alloc_xen_pagetable(); - if ( pl2e == NULL ) + /* Pass virt to indicate we need to flush. */ + if ( shatter_l3e(pl3e, virt, locking) ) return -ENOMEM; - - for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) - l2e_write(pl2e + i, - l2e_from_pfn(l3e_get_pfn(ol3e) + - (i << PAGETABLE_ORDER), - l3e_get_flags(ol3e))); - - if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL ) - flush_flags |= FLUSH_TLB_GLOBAL; - - if ( locking ) - spin_lock(&map_pgdir_lock); - if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) && - (l3e_get_flags(*pl3e) & _PAGE_PSE) ) - { - l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e), - __PAGE_HYPERVISOR)); - pl2e = NULL; - } - if ( locking ) - spin_unlock(&map_pgdir_lock); - flush_area(virt, flush_flags); - if ( pl2e ) - free_xen_pagetable(pl2e); } pl2e = virt_to_xen_l2e(virt); @@ -5578,27 +5596,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf) } /* PAGE1GB: shatter the superpage and fall through. */ - pl2e = alloc_xen_pagetable(); - if ( !pl2e ) + if ( shatter_l3e(pl3e, 0, locking) ) return -ENOMEM; - for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) - l2e_write(pl2e + i, - l2e_from_pfn(l3e_get_pfn(*pl3e) + - (i << PAGETABLE_ORDER), - l3e_get_flags(*pl3e))); - if ( locking ) - spin_lock(&map_pgdir_lock); - if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) && - (l3e_get_flags(*pl3e) & _PAGE_PSE) ) - { - l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e), - __PAGE_HYPERVISOR)); - pl2e = NULL; - } - if ( locking ) - spin_unlock(&map_pgdir_lock); - if ( pl2e ) - free_xen_pagetable(pl2e); } /*
map_pages_to_xen and modify_xen_mappings are performing almost exactly the same operations when shattering an l3 PTE, the only difference being whether we want to flush. Signed-off-by: Hongyan Xia <hongyxia@amazon.com> --- Changes in v2: - improve asm. - re-read pl3e from memory when taking the lock. - move the allocation of l2t inside the shatter function. --- xen/arch/x86/mm.c | 97 +++++++++++++++++++++++------------------------ 1 file changed, 48 insertions(+), 49 deletions(-)