Message ID | 20200309102304.1251-4-paul@xen.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | remove one more shared xenheap page: shared_info | expand |
On 09.03.2020 11:23, paul@xen.org wrote: > From: Paul Durrant <pdurrant@amazon.com> > > This patch modifies several places walking the domain's page_list to make > them ignore PGC_extra pages: > > - dump_pageframe_info() should ignore PGC_extra pages in its dump as it > determines whether to dump using domain_tot_pages() which also ignores > PGC_extra pages. This argument looks wrong to me: Let's take an example - a domain almost fully cleaned up, with 8 "normal" and 3 "extra" pages left. domain_tot_pages() returns 8 in this case, i.e. "normal" page dumping doesn't get skipped. However, there now won't be any trace of the "extra" pages, because they're also not on xenpage_list, which gets all its pages dumped in all cases. Correct restoration of original behavior would be to dump "normal" pages when there are less than 10, and to dump all "extra" pages. (Same of course goes for live domains, where "normal" page dumping would be skipped in the common case, but xenheap pages would be dumped, and hence so should be "extra" ones.) As indicated before, the removal of the APIC assist page from xenpage_list was already slightly regressing in this regard (as well as in at least one other way, I'm afraid), and you're now deliberately making the regression even bigger. Jan
> -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 09 March 2020 13:04 > To: paul@xen.org > Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper > <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com> > Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > > On 09.03.2020 11:23, paul@xen.org wrote: > > From: Paul Durrant <pdurrant@amazon.com> > > > > This patch modifies several places walking the domain's page_list to make > > them ignore PGC_extra pages: > > > > - dump_pageframe_info() should ignore PGC_extra pages in its dump as it > > determines whether to dump using domain_tot_pages() which also ignores > > PGC_extra pages. > > This argument looks wrong to me: Let's take an example - a domain > almost fully cleaned up, with 8 "normal" and 3 "extra" pages left. > domain_tot_pages() returns 8 in this case, i.e. "normal" page > dumping doesn't get skipped. However, there now won't be any trace > of the "extra" pages, because they're also not on xenpage_list, > which gets all its pages dumped in all cases. Correct restoration > of original behavior would be to dump "normal" pages when there > are less than 10, and to dump all "extra" pages. (Same of course > goes for live domains, where "normal" page dumping would be > skipped in the common case, but xenheap pages would be dumped, and > hence so should be "extra" ones.) As indicated before, the removal > of the APIC assist page from xenpage_list was already slightly > regressing in this regard (as well as in at least one other way, > I'm afraid), and you're now deliberately making the regression > even bigger. I thought the idea here was that the domheap dump loop should be dumping 'normal' pages so it seems reasonable to me that the number of pages dumped to match the value returned by domain_tot_pages(). Would you perhaps be happier if we put 'extra' pages on separate page list, which can be unconditionally dumped so as I transition xenheap pages to 'extra' pages they don't get missed? It would also get rid of some of the other checks for PGC_extra that I have to introduce because they currently end up on the domain's page list. Paul > > Jan
On 10.03.2020 14:32, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> Sent: 09 March 2020 13:04 >> To: paul@xen.org >> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper >> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com> >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM >> >> On 09.03.2020 11:23, paul@xen.org wrote: >>> From: Paul Durrant <pdurrant@amazon.com> >>> >>> This patch modifies several places walking the domain's page_list to make >>> them ignore PGC_extra pages: >>> >>> - dump_pageframe_info() should ignore PGC_extra pages in its dump as it >>> determines whether to dump using domain_tot_pages() which also ignores >>> PGC_extra pages. >> >> This argument looks wrong to me: Let's take an example - a domain >> almost fully cleaned up, with 8 "normal" and 3 "extra" pages left. >> domain_tot_pages() returns 8 in this case, i.e. "normal" page >> dumping doesn't get skipped. However, there now won't be any trace >> of the "extra" pages, because they're also not on xenpage_list, >> which gets all its pages dumped in all cases. Correct restoration >> of original behavior would be to dump "normal" pages when there >> are less than 10, and to dump all "extra" pages. (Same of course >> goes for live domains, where "normal" page dumping would be >> skipped in the common case, but xenheap pages would be dumped, and >> hence so should be "extra" ones.) As indicated before, the removal >> of the APIC assist page from xenpage_list was already slightly >> regressing in this regard (as well as in at least one other way, >> I'm afraid), and you're now deliberately making the regression >> even bigger. > > I thought the idea here was that the domheap dump loop should be > dumping 'normal' pages so it seems reasonable to me that the number > of pages dumped to match the value returned by domain_tot_pages(). I never thought of such a connection. To me the invocation of domain_tot_pages() there is there only to avoid overly much output. > Would you perhaps be happier if we put 'extra' pages on separate > page list, which can be unconditionally dumped so as I transition > xenheap pages to 'extra' pages they don't get missed? It would > also get rid of some of the other checks for PGC_extra that I > have to introduce because they currently end up on the domain's > page list. Hmm, wasn't it an intended side effect to have all pages on one list now? Introducing a 3rd list (even if just temporarily, until xenpage_list can be dropped) will be somewhat ugly because of how arch_free_heap_page() works. In reply to patch 6 I did suggest to have a separate list, but without taking these pages off d->page_list, such that here you would skip them in the main domain page dumping loop, but you would then traverse that second list and dump all of its elements, just like xenpage_list gets handled there. Jan
> -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 10 March 2020 14:59 > To: paul@xen.org > Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper' > <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com> > Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > > On 10.03.2020 14:32, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@suse.com> > >> Sent: 09 March 2020 13:04 > >> To: paul@xen.org > >> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper > >> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com> > >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > >> > >> On 09.03.2020 11:23, paul@xen.org wrote: > >>> From: Paul Durrant <pdurrant@amazon.com> > >>> > >>> This patch modifies several places walking the domain's page_list to make > >>> them ignore PGC_extra pages: > >>> > >>> - dump_pageframe_info() should ignore PGC_extra pages in its dump as it > >>> determines whether to dump using domain_tot_pages() which also ignores > >>> PGC_extra pages. > >> > >> This argument looks wrong to me: Let's take an example - a domain > >> almost fully cleaned up, with 8 "normal" and 3 "extra" pages left. > >> domain_tot_pages() returns 8 in this case, i.e. "normal" page > >> dumping doesn't get skipped. However, there now won't be any trace > >> of the "extra" pages, because they're also not on xenpage_list, > >> which gets all its pages dumped in all cases. Correct restoration > >> of original behavior would be to dump "normal" pages when there > >> are less than 10, and to dump all "extra" pages. (Same of course > >> goes for live domains, where "normal" page dumping would be > >> skipped in the common case, but xenheap pages would be dumped, and > >> hence so should be "extra" ones.) As indicated before, the removal > >> of the APIC assist page from xenpage_list was already slightly > >> regressing in this regard (as well as in at least one other way, > >> I'm afraid), and you're now deliberately making the regression > >> even bigger. > > > > I thought the idea here was that the domheap dump loop should be > > dumping 'normal' pages so it seems reasonable to me that the number > > of pages dumped to match the value returned by domain_tot_pages(). > > I never thought of such a connection. To me the invocation of > domain_tot_pages() there is there only to avoid overly much output. > > > Would you perhaps be happier if we put 'extra' pages on separate > > page list, which can be unconditionally dumped so as I transition > > xenheap pages to 'extra' pages they don't get missed? It would > > also get rid of some of the other checks for PGC_extra that I > > have to introduce because they currently end up on the domain's > > page list. > > Hmm, wasn't it an intended side effect to have all pages on one > list now? That would be nice, but I cannot reconcile that with unconditionally dumping all extra pages... otherwise dump_pageframe_info() would always have to walk the entire page_list no matter how long it was. > Introducing a 3rd list (even if just temporarily, until > xenpage_list can be dropped) will be somewhat ugly because of how > arch_free_heap_page() works. Yes, but it would at least be temporary. > In reply to patch 6 I did suggest to > have a separate list, but without taking these pages off > d->page_list, How would that work without adding an extra page_list_entry into struct page_info? > such that here you would skip them in the main > domain page dumping loop, but you would then traverse that second > list and dump all of its elements, just like xenpage_list gets > handled there. > Well, that's what I'm trying to achieve, yes. Paul
On 10.03.2020 16:05, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> Sent: 10 March 2020 14:59 >> >> In reply to patch 6 I did suggest to >> have a separate list, but without taking these pages off >> d->page_list, > > How would that work without adding an extra page_list_entry into struct page_info? As said there, it'd be a singly linked list using a __pdx_t field just like there already is with "next_shadow", i.e. you'd add another union member "next_extra" or some such. Of course the list shouldn't grow too long, or else insertion and removal may become a bottleneck. Not sure how well this would fit Arm, though; maybe they wouldn't need this, but that depends on whether the list would be used for purposes beyond dumping. Jan >> such that here you would skip them in the main >> domain page dumping loop, but you would then traverse that second >> list and dump all of its elements, just like xenpage_list gets >> handled there. >> > > Well, that's what I'm trying to achieve, yes. > > Paul >
> -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 10 March 2020 15:13 > To: paul@xen.org > Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper' > <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com> > Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > > On 10.03.2020 16:05, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@suse.com> > >> Sent: 10 March 2020 14:59 > >> > >> In reply to patch 6 I did suggest to > >> have a separate list, but without taking these pages off > >> d->page_list, > > > > How would that work without adding an extra page_list_entry into struct page_info? > > As said there, it'd be a singly linked list using a __pdx_t > field just like there already is with "next_shadow", i.e. > you'd add another union member "next_extra" or some such. Of > course the list shouldn't grow too long, or else insertion > and removal may become a bottleneck. Not sure how well this > would fit Arm, though; maybe they wouldn't need this, but > that depends on whether the list would be used for purposes > beyond dumping. > That seems more obscure and bug-prone than an extra list head in struct domain. I'd really prefer to stick with that even if it does make things a little more ugly until xenpage_list goes away. Paul
On 10.03.2020 16:16, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> Sent: 10 March 2020 15:13 >> To: paul@xen.org >> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper' >> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com> >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM >> >> On 10.03.2020 16:05, Paul Durrant wrote: >>>> -----Original Message----- >>>> From: Jan Beulich <jbeulich@suse.com> >>>> Sent: 10 March 2020 14:59 >>>> >>>> In reply to patch 6 I did suggest to >>>> have a separate list, but without taking these pages off >>>> d->page_list, >>> >>> How would that work without adding an extra page_list_entry into struct page_info? >> >> As said there, it'd be a singly linked list using a __pdx_t >> field just like there already is with "next_shadow", i.e. >> you'd add another union member "next_extra" or some such. Of >> course the list shouldn't grow too long, or else insertion >> and removal may become a bottleneck. Not sure how well this >> would fit Arm, though; maybe they wouldn't need this, but >> that depends on whether the list would be used for purposes >> beyond dumping. > > That seems more obscure and bug-prone than an extra list head > in struct domain. I'd really prefer to stick with that even > if it does make things a little more ugly until xenpage_list > goes away. Okay with me if there really was no property (other than assign_pages() then needing to pick the right list, which is not much different from needing to put the extra pages on two lists) that you'd lose by no longer having the pages on the same list. Jan
> -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 10 March 2020 15:33 > To: paul@xen.org > Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper' > <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com> > Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > > On 10.03.2020 16:16, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@suse.com> > >> Sent: 10 March 2020 15:13 > >> To: paul@xen.org > >> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper' > >> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com> > >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM > >> > >> On 10.03.2020 16:05, Paul Durrant wrote: > >>>> -----Original Message----- > >>>> From: Jan Beulich <jbeulich@suse.com> > >>>> Sent: 10 March 2020 14:59 > >>>> > >>>> In reply to patch 6 I did suggest to > >>>> have a separate list, but without taking these pages off > >>>> d->page_list, > >>> > >>> How would that work without adding an extra page_list_entry into struct page_info? > >> > >> As said there, it'd be a singly linked list using a __pdx_t > >> field just like there already is with "next_shadow", i.e. > >> you'd add another union member "next_extra" or some such. Of > >> course the list shouldn't grow too long, or else insertion > >> and removal may become a bottleneck. Not sure how well this > >> would fit Arm, though; maybe they wouldn't need this, but > >> that depends on whether the list would be used for purposes > >> beyond dumping. > > > > That seems more obscure and bug-prone than an extra list head > > in struct domain. I'd really prefer to stick with that even > > if it does make things a little more ugly until xenpage_list > > goes away. > > Okay with me if there really was no property (other than > assign_pages() then needing to pick the right list, which is > not much different from needing to put the extra pages on two > lists) that you'd lose by no longer having the pages on the > same list. Just assign_pages() and arch_free_heap_page(), and my test patch defines: +static inline struct page_list_head *page_to_list( + struct domain *d, const struct page_info *pg) +{ + if ( is_xen_heap_page(pg) ) + return &d->xenpage_list; + else if ( pg->count_info & PGC_extra ) + return &d->extra_page_list; + + return &d->page_list; +} which they both use to select the right list. Once xenpage_list goes away then this can be simplified to use a ternary operator. Paul > > Jan
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index bdcc0d972a..f6ed25e8ee 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -231,6 +231,9 @@ void dump_pageframe_info(struct domain *d) unsigned int index = MASK_EXTR(page->u.inuse.type_info, PGT_type_mask); + if ( page->count_info & PGC_extra ) + continue; + if ( ++total[index] > 16 ) { switch ( page->u.inuse.type_info & PGT_type_mask ) @@ -1044,7 +1047,8 @@ int arch_set_info_guest( { struct page_info *page = page_list_remove_head(&d->page_list); - if ( page_lock(page) ) + if ( !(page->count_info & PGC_extra) && + page_lock(page) ) { if ( (page->u.inuse.type_info & PGT_type_mask) == PGT_l4_page_table ) diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index 9f51370327..71d2fb9bbc 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -2843,6 +2843,9 @@ void audit_p2m(struct domain *d, spin_lock(&d->page_alloc_lock); page_list_for_each ( page, &d->page_list ) { + if ( page->count_info & PGC_extra ) + continue; + mfn = mfn_x(page_to_mfn(page)); P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c index f1066c59c7..7e5aa8dc95 100644 --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -428,6 +428,9 @@ static void dump_numa(unsigned char key) spin_lock(&d->page_alloc_lock); page_list_for_each(page, &d->page_list) { + if ( page->count_info & PGC_extra ) + break; + i = phys_to_nid(page_to_maddr(page)); page_num_node[i]++; } diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c index dc16ef2e79..f8f1bbe2f4 100644 --- a/xen/arch/x86/pv/dom0_build.c +++ b/xen/arch/x86/pv/dom0_build.c @@ -792,6 +792,10 @@ int __init dom0_construct_pv(struct domain *d, { mfn = mfn_x(page_to_mfn(page)); BUG_ON(SHARED_M2P(get_gpfn_from_mfn(mfn))); + + if ( page->count_info & PGC_extra ) + continue; + if ( get_gpfn_from_mfn(mfn) >= count ) { BUG_ON(is_pv_32bit_domain(d)); diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c index 8c232270b4..6cc020cb71 100644 --- a/xen/arch/x86/tboot.c +++ b/xen/arch/x86/tboot.c @@ -220,7 +220,12 @@ static void tboot_gen_domain_integrity(const uint8_t key[TB_KEY_SIZE], spin_lock(&d->page_alloc_lock); page_list_for_each(page, &d->page_list) { - void *pg = __map_domain_page(page); + void *pg; + + if ( page->count_info & PGC_extra ) + continue; + + pg = __map_domain_page(page); vmac_update(pg, PAGE_SIZE, &ctx); unmap_domain_page(pg); }