Message ID | 1491382790-30066-1-git-send-email-yu.c.zhang@linux.intel.com (mailing list archive)
---|---
State | New, archived
On 05/04/17 09:59, Yu Zhang wrote:
> Previously, p2m_finish_type_change() is triggered to iterate and
> clean up the p2m table when an ioreq server unmaps from memory type
> HVMMEM_ioreq_server. The current iteration number is set to 256,
> and after these iterations, hypercall pre-emption is checked.
>
> But it is likely that no p2m change is performed for the
> just-finished iterations, which means p2m_finish_type_change() will
> return quite soon. So in such a scenario, we can allow the p2m
> iteration to continue without checking the hypercall pre-emption.

Suppose you have a guest with 128TiB of RAM, and the ioreq_server p2m
entries are at the very end of RAM. Won't this run for several minutes
before even allowing preemption?

 -George
On 05/04/17 16:10, George Dunlap wrote:
> On 05/04/17 09:59, Yu Zhang wrote:
>> Previously, p2m_finish_type_change() is triggered to iterate and
>> clean up the p2m table when an ioreq server unmaps from memory type
>> HVMMEM_ioreq_server. The current iteration number is set to 256,
>> and after these iterations, hypercall pre-emption is checked.
>>
>> But it is likely that no p2m change is performed for the
>> just-finished iterations, which means p2m_finish_type_change() will
>> return quite soon. So in such a scenario, we can allow the p2m
>> iteration to continue without checking the hypercall pre-emption.
>
> Suppose you have a guest with 128TiB of RAM, and the ioreq_server p2m
> entries are at the very end of RAM. Won't this run for several
> minutes before even allowing preemption?

Sorry, this should be GiB. But I think you get my point. :-)

 -George
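For scale, assuming 4KiB pages and the ioreq_server entries sitting
only at the very end of the p2m (an editorial illustration, not from
the thread):

  128 GiB / 4 KiB per page = 2^37 / 2^12 = 2^25 = 33,554,432 GFNs
  33,554,432 GFNs / 256 GFNs per batch = 131,072 batches

With the patch as posted, every batch before the tail returns with
"changed" still false, so the loop would perform tens of millions of
get_gfn_query_unlocked() lookups without ever reaching
hypercall_preempt_check().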
On 4/5/2017 11:11 PM, George Dunlap wrote:
> On 05/04/17 16:10, George Dunlap wrote:
>> On 05/04/17 09:59, Yu Zhang wrote:
>>> Previously, p2m_finish_type_change() is triggered to iterate and
>>> clean up the p2m table when an ioreq server unmaps from memory type
>>> HVMMEM_ioreq_server. The current iteration number is set to 256,
>>> and after these iterations, hypercall pre-emption is checked.
>>>
>>> But it is likely that no p2m change is performed for the
>>> just-finished iterations, which means p2m_finish_type_change() will
>>> return quite soon. So in such a scenario, we can allow the p2m
>>> iteration to continue without checking the hypercall pre-emption.
>> Suppose you have a guest with 128TiB of RAM, and the ioreq_server
>> p2m entries are at the very end of RAM. Won't this run for several
>> minutes before even allowing preemption?
> Sorry, this should be GiB. But I think you get my point. :-)

Yep. I got it. I'd better reconsider it - my head is quite dizzy now.
Maybe together with your generic p2m change solution in 4.10. :-)

Thanks
Yu

> -George
>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -411,14 +411,17 @@ static int dm_op(domid_t domid,
>              while ( read_atomic(&p2m->ioreq.entry_count) &&
>                      first_gfn <= p2m->max_mapped_pfn )
>              {
> +                bool changed = false;
> +
>                  /* Iterate p2m table for 256 gfns each time. */
>                  p2m_finish_type_change(d, _gfn(first_gfn), 256,
> -                                       p2m_ioreq_server, p2m_ram_rw);
> +                                       p2m_ioreq_server, p2m_ram_rw, &changed);
>
>                  first_gfn += 256;
>
>                  /* Check for continuation if it's not the last iteration. */
>                  if ( first_gfn <= p2m->max_mapped_pfn &&
> +                     changed &&
>                       hypercall_preempt_check() )
>                  {
>                      rc = -ERESTART;

I appreciate and support the intention, but you're opening up a long
lasting loop here in case very little or no changes need to be done.
You need to check for preemption every so many iterations even if
you've never seen "changed" come back set.

Jan
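One possible shape for Jan's suggestion, sketched here for
illustration (the every-16-batches interval is an assumption, not part
of the posted patch): treat "changed" as a hint, but still check for
preemption at a fixed interval of batches:

            /* Sketch only: bound the work done between preemption checks. */
            while ( read_atomic(&p2m->ioreq.entry_count) &&
                    first_gfn <= p2m->max_mapped_pfn )
            {
                bool changed = false;

                /* Iterate p2m table for 256 gfns each time. */
                p2m_finish_type_change(d, _gfn(first_gfn), 256,
                                       p2m_ioreq_server, p2m_ram_rw, &changed);

                first_gfn += 256;

                /*
                 * Check for continuation if this batch changed anything,
                 * or unconditionally every 16 batches: first_gfn stays a
                 * multiple of 256, so it crosses a 4096 boundary every 16
                 * iterations.  Long runs of unchanged entries therefore
                 * cannot postpone preemption indefinitely.
                 */
                if ( first_gfn <= p2m->max_mapped_pfn &&
                     (changed || !(first_gfn & 0xfff)) &&
                     hypercall_preempt_check() )
                {
                    rc = -ERESTART;
                    break;  /* continuation handling as in the original loop */
                }
            }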
>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1034,12 +1034,13 @@ void p2m_change_type_range(struct domain *d,
>  /* Synchronously modify the p2m type for a range of gfns from ot to nt. */
>  void p2m_finish_type_change(struct domain *d,
>                              gfn_t first_gfn, unsigned long max_nr,
> -                            p2m_type_t ot, p2m_type_t nt)
> +                            p2m_type_t ot, p2m_type_t nt, bool *changed)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      p2m_type_t t;
>      unsigned long gfn = gfn_x(first_gfn);
>      unsigned long last_gfn = gfn + max_nr - 1;
> +    bool is_changed = false;
>
>      ASSERT(ot != nt);
>      ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
> @@ -1052,12 +1053,18 @@ void p2m_finish_type_change(struct domain *d,
>          get_gfn_query_unlocked(d, gfn, &t);
>
>          if ( t == ot )
> +        {
>              p2m_change_type_one(d, gfn, t, nt);
> +            is_changed = true;
> +        }
>
>          gfn++;
>      }
>
>      p2m_unlock(p2m);
> +
> +    if ( changed )
> +        *changed = is_changed;
>  }

Also, wouldn't it be better to return a count here? If there was just
a single change in the current 256-GFN batch, surely we could take on
another?

Jan
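A minimal sketch of that count-returning variant, for illustration
(the loop skeleton is reconstructed from the hunk above; how the
caller spends the returned count would be a new policy decision):

/* Sketch only: report how many entries were actually changed. */
unsigned long p2m_finish_type_change(struct domain *d,
                                     gfn_t first_gfn, unsigned long max_nr,
                                     p2m_type_t ot, p2m_type_t nt)
{
    struct p2m_domain *p2m = p2m_get_hostp2m(d);
    unsigned long gfn = gfn_x(first_gfn);
    unsigned long last_gfn = gfn + max_nr - 1;
    unsigned long changed = 0;
    p2m_type_t t;

    ASSERT(ot != nt);
    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));

    p2m_lock(p2m);

    while ( gfn <= last_gfn )
    {
        get_gfn_query_unlocked(d, gfn, &t);

        if ( t == ot )
        {
            p2m_change_type_one(d, gfn, t, nt);
            changed++;
        }

        gfn++;
    }

    p2m_unlock(p2m);

    return changed;
}

The caller could then accumulate the returned counts across batches
and only consult hypercall_preempt_check() once some minimum amount of
real work has been done, covering the "just a single change" case Jan
describes.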
On 5/10/2017 12:29 AM, Jan Beulich wrote:
>>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/hvm/dm.c
>> +++ b/xen/arch/x86/hvm/dm.c
>> @@ -411,14 +411,17 @@ static int dm_op(domid_t domid,
>>              while ( read_atomic(&p2m->ioreq.entry_count) &&
>>                      first_gfn <= p2m->max_mapped_pfn )
>>              {
>> +                bool changed = false;
>> +
>>                  /* Iterate p2m table for 256 gfns each time. */
>>                  p2m_finish_type_change(d, _gfn(first_gfn), 256,
>> -                                       p2m_ioreq_server, p2m_ram_rw);
>> +                                       p2m_ioreq_server, p2m_ram_rw, &changed);
>>
>>                  first_gfn += 256;
>>
>>                  /* Check for continuation if it's not the last iteration. */
>>                  if ( first_gfn <= p2m->max_mapped_pfn &&
>> +                     changed &&
>>                       hypercall_preempt_check() )
>>                  {
>>                      rc = -ERESTART;
> I appreciate and support the intention, but you're opening up a
> long lasting loop here in case very little or no changes need to
> be done. You need to check for preemption every so many
> iterations even if you've never seen "changed" come back set.

Thanks for your comments, Jan. Indeed, this patch is problematic.

Another thought: since the current p2m sweeping implementation
disables live migration when there are ioreq server entries left, and
George had previously proposed a generic p2m change solution, I'd like
to leave this optimization to a future Xen release, together with that
generic solution. :-)

Yu

> Jan
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index d72b7bd..3aa1286 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -411,14 +411,17 @@ static int dm_op(domid_t domid,
             while ( read_atomic(&p2m->ioreq.entry_count) &&
                     first_gfn <= p2m->max_mapped_pfn )
             {
+                bool changed = false;
+
                 /* Iterate p2m table for 256 gfns each time. */
                 p2m_finish_type_change(d, _gfn(first_gfn), 256,
-                                       p2m_ioreq_server, p2m_ram_rw);
+                                       p2m_ioreq_server, p2m_ram_rw, &changed);
 
                 first_gfn += 256;
 
                 /* Check for continuation if it's not the last iteration. */
                 if ( first_gfn <= p2m->max_mapped_pfn &&
+                     changed &&
                      hypercall_preempt_check() )
                 {
                     rc = -ERESTART;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 0daaa86..b171a4b 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1034,12 +1034,13 @@ void p2m_change_type_range(struct domain *d,
 /* Synchronously modify the p2m type for a range of gfns from ot to nt. */
 void p2m_finish_type_change(struct domain *d,
                             gfn_t first_gfn, unsigned long max_nr,
-                            p2m_type_t ot, p2m_type_t nt)
+                            p2m_type_t ot, p2m_type_t nt, bool *changed)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     p2m_type_t t;
     unsigned long gfn = gfn_x(first_gfn);
     unsigned long last_gfn = gfn + max_nr - 1;
+    bool is_changed = false;
 
     ASSERT(ot != nt);
     ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
@@ -1052,12 +1053,18 @@ void p2m_finish_type_change(struct domain *d,
         get_gfn_query_unlocked(d, gfn, &t);
 
         if ( t == ot )
+        {
             p2m_change_type_one(d, gfn, t, nt);
+            is_changed = true;
+        }
 
         gfn++;
     }
 
     p2m_unlock(p2m);
+
+    if ( changed )
+        *changed = is_changed;
 }
 
 /*
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0e670af..2e02538 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -615,7 +615,7 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
 
 void p2m_finish_type_change(struct domain *d,
                             gfn_t first_gfn, unsigned long max_nr,
-                            p2m_type_t ot, p2m_type_t nt);
+                            p2m_type_t ot, p2m_type_t nt, bool *changed);
 
 /* Report a change affecting memory types. */
 void p2m_memory_type_changed(struct domain *d);
Previously, p2m_finish_type_change() is triggered to iterate and
clean up the p2m table when an ioreq server unmaps from memory type
HVMMEM_ioreq_server. The current iteration number is set to 256,
and after these iterations, hypercall pre-emption is checked.

But it is likely that no p2m change is performed for the just-finished
iterations, which means p2m_finish_type_change() will return quite
soon. So in such a scenario, we can allow the p2m iteration to
continue without checking the hypercall pre-emption.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Note: this patch shall only be accepted after the previous x86/ioreq
server patches have been merged.

Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/hvm/dm.c     | 5 ++++-
 xen/arch/x86/mm/p2m.c     | 9 ++++++++-
 xen/include/asm-x86/p2m.h | 2 +-
 3 files changed, 13 insertions(+), 3 deletions(-)