Message ID | 1473139058-32003-1-git-send-email-dongli.zhang@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Sep 06, 2016 at 01:17:38PM +0800, Dongli Zhang wrote:
> This patch implemented parts of TODO left in commit id
> a902c12ee45fc9389eb8fe54eeddaf267a555c58. It moved TLB-flush filtering out
> into populate_physmap.
>
> Because of TLB-flush in alloc_heap_pages, it's very slow to create a guest
> with memory size of more than 100GB on host with 100+ cpus.
>
> This patch introduced a "MEMF_no_tlbflush" bit to memflag to indicate
> whether TLB-flush should be done in alloc_heap_pages or its caller
> populate_physmap. Once this bit is set in memflag, alloc_heap_pages will
> ignore TLB-flush.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  xen/common/memory.c     | 26 ++++++++++++++++++++++++++
>  xen/common/page_alloc.c |  3 ++-
>  xen/include/xen/mm.h    |  2 ++
>  3 files changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index f34dd56..14ec5fa 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -141,6 +141,8 @@ static void populate_physmap(struct memop_args *a)
>      unsigned int i, j;
>      xen_pfn_t gpfn, mfn;
>      struct domain *d = a->domain, *curr_d = current->domain;
> +    bool_t need_tlbflush = 0;
> +    uint32_t tlbflush_timestamp = 0;
>
>      if ( !guest_handle_subrange_okay(a->extent_list, a->nr_done,
>                                       a->nr_extents-1) )
> @@ -149,6 +151,8 @@ static void populate_physmap(struct memop_args *a)
>      if ( a->extent_order > (a->memflags & MEMF_populate_on_demand ? MAX_ORDER :
>                              max_order(curr_d)) )
>          return;
> +
> +    a->memflags |= MEMF_no_tlbflush;
>
>      for ( i = a->nr_done; i < a->nr_extents; i++ )
>      {
> @@ -213,6 +217,18 @@ static void populate_physmap(struct memop_args *a)
>                                   i, a->nr_extents);
>                      goto out;
>                  }
> +
> +                for ( j = 0; j < (1U << a->extent_order); j++ )
> +                {
> +                    if ( page[j].u.free.need_tlbflush &&
> +                         (page[j].tlbflush_timestamp <= tlbflush_current_time()) &&
> +                         (!need_tlbflush ||
> +                          (page[j].tlbflush_timestamp > tlbflush_timestamp)) )
> +                    {
> +                        need_tlbflush = 1;
> +                        tlbflush_timestamp = page[j].tlbflush_timestamp;
> +                    }
> +                }
>

Indentation seems rather wrong. Please configure your editor properly.

Wei.
diff --git a/xen/common/memory.c b/xen/common/memory.c
index f34dd56..14ec5fa 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -141,6 +141,8 @@ static void populate_physmap(struct memop_args *a)
     unsigned int i, j;
     xen_pfn_t gpfn, mfn;
     struct domain *d = a->domain, *curr_d = current->domain;
+    bool_t need_tlbflush = 0;
+    uint32_t tlbflush_timestamp = 0;
 
     if ( !guest_handle_subrange_okay(a->extent_list, a->nr_done,
                                      a->nr_extents-1) )
@@ -149,6 +151,8 @@ static void populate_physmap(struct memop_args *a)
     if ( a->extent_order > (a->memflags & MEMF_populate_on_demand ? MAX_ORDER :
                             max_order(curr_d)) )
         return;
+
+    a->memflags |= MEMF_no_tlbflush;
 
     for ( i = a->nr_done; i < a->nr_extents; i++ )
     {
@@ -213,6 +217,18 @@ static void populate_physmap(struct memop_args *a)
                                  i, a->nr_extents);
                     goto out;
                 }
+
+                for ( j = 0; j < (1U << a->extent_order); j++ )
+                {
+                    if ( page[j].u.free.need_tlbflush &&
+                         (page[j].tlbflush_timestamp <= tlbflush_current_time()) &&
+                         (!need_tlbflush ||
+                          (page[j].tlbflush_timestamp > tlbflush_timestamp)) )
+                    {
+                        need_tlbflush = 1;
+                        tlbflush_timestamp = page[j].tlbflush_timestamp;
+                    }
+                }
 
                 mfn = page_to_mfn(page);
             }
@@ -232,6 +248,16 @@ static void populate_physmap(struct memop_args *a)
     }
 
 out:
+    if ( need_tlbflush )
+    {
+        cpumask_t mask = cpu_online_map;
+        tlbflush_filter(mask, tlbflush_timestamp);
+        if ( !cpumask_empty(&mask) )
+        {
+            perfc_incr(need_flush_tlb_flush);
+            flush_tlb_mask(&mask);
+        }
+    }
     a->nr_done = i;
 }
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 18ff6cf..79f633b 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -827,7 +827,8 @@ static struct page_info *alloc_heap_pages(
         BUG_ON(pg[i].count_info != PGC_state_free);
         pg[i].count_info = PGC_state_inuse;
 
-        if ( pg[i].u.free.need_tlbflush &&
+        if ( !(memflags & MEMF_no_tlbflush) &&
+             pg[i].u.free.need_tlbflush &&
              (pg[i].tlbflush_timestamp <= tlbflush_current_time()) &&
              (!need_tlbflush ||
               (pg[i].tlbflush_timestamp > tlbflush_timestamp)) )
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 58bc0b8..880ca88 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -221,6 +221,8 @@ struct npfec {
 #define  MEMF_exact_node  (1U<<_MEMF_exact_node)
 #define _MEMF_no_owner    5
 #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
+#define _MEMF_no_tlbflush 6
+#define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
 #define _MEMF_node        8
 #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
 #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
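
The new flag takes bit 6, between MEMF_no_owner (bit 5) and the node field that starts at bit 8. The standalone sketch below copies the definitions from the mm.h hunk and checks that the flag does not overlap the node field. It is an illustration only: the one-byte `nodeid_t` typedef is an assumption made for this sketch, and `allocator_filters_tlb` is a hypothetical helper name that mirrors the test added to alloc_heap_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: a one-byte node id, so the node field is 8 bits. */
typedef uint8_t nodeid_t;

/* Definitions copied from the mm.h hunk. */
#define _MEMF_no_owner    5
#define  MEMF_no_owner    (1U<<_MEMF_no_owner)
#define _MEMF_no_tlbflush 6
#define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
#define _MEMF_node        8
#define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
#define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)

/* Compile-time check: the new flag bit lies outside the node field. */
static_assert((MEMF_no_tlbflush & (MEMF_node_mask << _MEMF_node)) == 0,
              "MEMF_no_tlbflush must not overlap the node field");

/*
 * Hypothetical helper mirroring the condition added to alloc_heap_pages:
 * the allocator does its own TLB-flush filtering only when the caller
 * has not set MEMF_no_tlbflush.
 */
static bool allocator_filters_tlb(unsigned int memflags)
{
    return !(memflags & MEMF_no_tlbflush);
}
```

Because the flag is an ordinary bit in memflags, callers that never set it keep the existing behaviour in alloc_heap_pages.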
This patch implemented parts of TODO left in commit id
a902c12ee45fc9389eb8fe54eeddaf267a555c58. It moved TLB-flush filtering out
into populate_physmap.

Because of TLB-flush in alloc_heap_pages, it's very slow to create a guest
with memory size of more than 100GB on host with 100+ cpus.

This patch introduced a "MEMF_no_tlbflush" bit to memflag to indicate
whether TLB-flush should be done in alloc_heap_pages or its caller
populate_physmap. Once this bit is set in memflag, alloc_heap_pages will
ignore TLB-flush.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 xen/common/memory.c     | 26 ++++++++++++++++++++++++++
 xen/common/page_alloc.c |  3 ++-
 xen/include/xen/mm.h    |  2 ++
 3 files changed, 30 insertions(+), 1 deletion(-)
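
The batching idea described above can be sketched outside the hypervisor. The following is a simplified model, not Xen code: `struct demo_page`, `struct flush_state`, `accumulate_extent`, `flush_once` and `demo_populate` are hypothetical names, the `now` parameter stands in for tlbflush_current_time(), and a counter stands in for the tlbflush_filter/flush_tlb_mask pair. It shows the caller folding each extent into a single "newest timestamp" and flushing once at the end instead of once per allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two page fields the patch inspects. */
struct demo_page {
    bool     need_tlbflush;
    uint32_t tlbflush_timestamp;
};

/* Accumulator the caller carries across all extents. */
struct flush_state {
    bool     need_tlbflush;
    uint32_t tlbflush_timestamp;
};

/* Counts simulated flushes; stands in for the real flush call. */
static unsigned int demo_flush_count;

/*
 * Fold one extent into the accumulator using the same condition as the
 * loop added to populate_physmap: keep the newest timestamp that is not
 * later than `now`.
 */
static void accumulate_extent(struct flush_state *s,
                              const struct demo_page *pg, size_t nr,
                              uint32_t now)
{
    assert(s != NULL && pg != NULL);
    for (size_t j = 0; j < nr; j++) {
        if (pg[j].need_tlbflush &&
            pg[j].tlbflush_timestamp <= now &&
            (!s->need_tlbflush ||
             pg[j].tlbflush_timestamp > s->tlbflush_timestamp)) {
            s->need_tlbflush = true;
            s->tlbflush_timestamp = pg[j].tlbflush_timestamp;
        }
    }
}

/* Issue at most one simulated flush for everything accumulated so far. */
static void flush_once(struct flush_state *s)
{
    assert(s != NULL);
    if (s->need_tlbflush) {
        demo_flush_count++;
        s->need_tlbflush = false;
    }
}

/*
 * Hypothetical driver: three extents of two pages each, current time 10.
 * Stores the newest accumulated timestamp and returns the flush count.
 */
static unsigned int demo_populate(uint32_t *newest_out)
{
    const struct demo_page extents[3][2] = {
        { { true, 5 },  { false, 9 } },
        { { true, 7 },  { true, 3 } },
        { { false, 0 }, { true, 100 } }, /* later than `now`: skipped */
    };
    struct flush_state s = { false, 0 };

    demo_flush_count = 0;
    for (size_t i = 0; i < 3; i++)
        accumulate_extent(&s, extents[i], 2, 10);

    *newest_out = s.tlbflush_timestamp;
    flush_once(&s);
    return demo_flush_count;
}
```

With the sample data, `demo_populate` records 7 as the newest timestamp and performs a single simulated flush for all three extents, where a per-allocation scheme would consider a flush after each extent.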