Message ID | 20200309102304.1251-7-paul@xen.org (mailing list archive)
---|---
State | Superseded
Series | remove one more shared xenheap page: shared_info
On 09.03.2020 11:23, paul@xen.org wrote:
> From: Paul Durrant <pdurrant@amazon.com>
>
> Currently shared_info is a shared xenheap page, but shared xenheap pages
> complicate future plans for live-update of Xen, so it is desirable,
> where possible, not to use them [1]. This patch therefore converts
> shared_info into a PGC_extra domheap page. This does entail freeing
> shared_info during domain_relinquish_resources() rather than
> domain_destroy(), so care is needed to avoid de-referencing a NULL
> shared_info pointer; hence some extra checks of 'is_dying' are needed.
>
> Also, because shared_info will no longer be a xenheap page, this patch
> adds an extra dump function to make sure the shared_info MFN is
> included in the output of dump_pageframe_info().

I've voiced my objection to such a model, and hence it'll take
another REST maintainer to approve of this despite my arguments
against it. (And of course, just to re-record this here, the
APIC access page, converted by ea3daabff5f2, will want to get a
dumping function added then, too.)

I wonder whether a domain's "extra" pages couldn't be put in a
separate, singly-linked list, using the union the next_shadow
field is in as the linking field. None of the other union
members look to be applicable to "extra" pages.

> +void dump_shared_info(struct domain *d)
> +{
> +    domain_lock(d);
> +
> +    if ( d->shared_info.virt )
> +        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));

count_info and type_info should be logged imo, just like
dump_pageframe_info() does. On the whole I think the actual
dumping might better be uniform, and these functions would
then only exist to "know" which page(s) to dump.

> --- a/xen/common/time.c
> +++ b/xen/common/time.c
> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>      uint32_t *wc_version;
>      uint64_t sec;
>
> +    if ( d != current->domain )
> +    {
> +        /*
> +         * We need to check is_dying here as, if it is set, the
> +         * shared_info may have been freed. To do this safely we need
> +         * to hold the domain lock.
> +         */
> +        domain_lock(d);
> +        if ( d->is_dying )
> +            goto unlock;
> +    }

This shouldn't happen very often, but it's a pretty heavy lock.
It's a fundamental aspect of xenheap pages that their disposal
can be delayed until almost the last moment of guest cleanup. I
continue to think it's not really a good idea to have special
purpose allocation (and mapping) accompanied by these pages
getting taken care of by the generic relinquish-resources logic
here (from a more general pov such is of course often nice to
have). Instead of freeing these pages there, couldn't they just
be taken off d->page_list, with the unmapping and freeing left
as it was?

Jan
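A minimal sketch of the singly-linked "extra" list suggested above, which would also make the count_info/type_info dumping uniform. The next_extra member (overlaying the next_shadow slot), the extra_page_list head and the helper names are illustrative assumptions, not code from the Xen tree:

/* Hypothetical singly-linked list of a domain's PGC_extra pages. */
static void add_extra_page(struct domain *d, struct page_info *pg)
{
    ASSERT(pg->count_info & PGC_extra);
    pg->next_extra = d->extra_page_list;  /* reuses the next_shadow slot */
    d->extra_page_list = pg;
}

/* Dump all extra pages uniformly, including count_info and type_info. */
static void dump_extra_pages(const struct domain *d)
{
    const struct page_info *pg;

    for ( pg = d->extra_page_list; pg; pg = pg->next_extra )
        printk("  MFN %"PRI_mfn" c=%#lx t=%#lx\n",
               mfn_x(page_to_mfn(pg)), pg->count_info,
               pg->u.inuse.type_info);
}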
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 09 March 2020 15:56
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>
> On 09.03.2020 11:23, paul@xen.org wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > Currently shared_info is a shared xenheap page, but shared xenheap pages
> > complicate future plans for live-update of Xen, so it is desirable,
> > where possible, not to use them [1]. This patch therefore converts
> > shared_info into a PGC_extra domheap page. This does entail freeing
> > shared_info during domain_relinquish_resources() rather than
> > domain_destroy(), so care is needed to avoid de-referencing a NULL
> > shared_info pointer; hence some extra checks of 'is_dying' are needed.
> >
> > Also, because shared_info will no longer be a xenheap page, this patch
> > adds an extra dump function to make sure the shared_info MFN is
> > included in the output of dump_pageframe_info().
>
> I've voiced my objection to such a model, and hence it'll take
> another REST maintainer to approve of this despite my arguments
> against it. (And of course, just to re-record this here, the
> APIC access page, converted by ea3daabff5f2, will want to get a
> dumping function added then, too.)
>
> I wonder whether a domain's "extra" pages couldn't be put in a
> separate, singly-linked list, using the union the next_shadow
> field is in as the linking field. None of the other union
> members look to be applicable to "extra" pages.
>
> > +void dump_shared_info(struct domain *d)
> > +{
> > +    domain_lock(d);
> > +
> > +    if ( d->shared_info.virt )
> > +        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
>
> count_info and type_info should be logged imo, just like
> dump_pageframe_info() does. On the whole I think the actual
> dumping might better be uniform, and these functions would
> then only exist to "know" which page(s) to dump.

I think the extra page list should cover these issues.

> > --- a/xen/common/time.c
> > +++ b/xen/common/time.c
> > @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >      uint32_t *wc_version;
> >      uint64_t sec;
> >
> > +    if ( d != current->domain )
> > +    {
> > +        /*
> > +         * We need to check is_dying here as, if it is set, the
> > +         * shared_info may have been freed. To do this safely we need
> > +         * to hold the domain lock.
> > +         */
> > +        domain_lock(d);
> > +        if ( d->is_dying )
> > +            goto unlock;
> > +    }
>
> This shouldn't happen very often, but it's a pretty heavy lock.
> It's a fundamental aspect of xenheap pages that their disposal
> can be delayed until almost the last moment of guest cleanup. I
> continue to think it's not really a good idea to have special
> purpose allocation (and mapping) accompanied by these pages
> getting taken care of by the generic relinquish-resources logic
> here (from a more general pov such is of course often nice to
> have). Instead of freeing these pages there, couldn't they just
> be taken off d->page_list, with the unmapping and freeing left
> as it was?

I don't think this can be achieved without being able to de-assign
pages, and I don't really want to have to invent new logic to do
that (basically re-implementing what happens to xenheap pages). I
really don't think it is that bad to deal with shared info and
grant table pages as domheap pages. Yes, we have to be careful,
but in this case the lock is only taken in a toolstack update of
the wallclock and, apart from start-of-day access, I think this is
limited to XEN_DOMCTL_settimeoffset and XEN_DOMCTL_setvcpucontext,
neither of which I believe is particularly performance-critical.

  Paul

> Jan
On 10.03.2020 18:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 09 March 2020 15:56
>>
>> On 09.03.2020 11:23, paul@xen.org wrote:
>>> --- a/xen/common/time.c
>>> +++ b/xen/common/time.c
>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>      uint32_t *wc_version;
>>>      uint64_t sec;
>>>
>>> +    if ( d != current->domain )
>>> +    {
>>> +        /*
>>> +         * We need to check is_dying here as, if it is set, the
>>> +         * shared_info may have been freed. To do this safely we need
>>> +         * to hold the domain lock.
>>> +         */
>>> +        domain_lock(d);
>>> +        if ( d->is_dying )
>>> +            goto unlock;
>>> +    }
>>
>> This shouldn't happen very often, but it's a pretty heavy lock.
>> It's a fundamental aspect of xenheap pages that their disposal
>> can be delayed until almost the last moment of guest cleanup. I
>> continue to think it's not really a good idea to have special
>> purpose allocation (and mapping) accompanied by these pages
>> getting taken care of by the generic relinquish-resources logic
>> here (from a more general pov such is of course often nice to
>> have). Instead of freeing these pages there, couldn't they just
>> be taken off d->page_list, with the unmapping and freeing left
>> as it was?
>
> I don't think this can be achieved without being able to de-assign
> pages, and I don't really want to have to invent new logic to do
> that (basically re-implementing what happens to xenheap pages).

Where's the connection to being able to de-assign pages here?
There'll be one when the same conversion is to be done for
gnttab code, but I don't see it here - the shared info page is
never to be de-assigned. As to gnttab code, I think it was
noted before that we may be better off not "unpopulating"
status pages when switching back from v2 to v1. At which point
the de-assignment need would go away there, too.

> I really don't think it is that bad to deal with shared info
> and grant table pages as domheap pages. Yes, we have to be
> careful, but in this case the lock is only taken in a
> toolstack update of the wallclock and, apart from start-of-day
> access, I think this is limited to XEN_DOMCTL_settimeoffset
> and XEN_DOMCTL_setvcpucontext, neither of which I believe is
> particularly performance-critical.

It's not, I agree (and hence I had started my previous reply
also with "This shouldn't happen very often"). How all of this
is going to look with the new extra_page_list I'll have to see
anyway. But for now I remain unconvinced of the want / need to
de-allocate the shared info page early.

Jan
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 11 March 2020 09:17
> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>
> On 10.03.2020 18:33, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 09 March 2020 15:56
> >>
> >> On 09.03.2020 11:23, paul@xen.org wrote:
> >>> --- a/xen/common/time.c
> >>> +++ b/xen/common/time.c
> >>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >>>      uint32_t *wc_version;
> >>>      uint64_t sec;
> >>>
> >>> +    if ( d != current->domain )
> >>> +    {
> >>> +        /*
> >>> +         * We need to check is_dying here as, if it is set, the
> >>> +         * shared_info may have been freed. To do this safely we need
> >>> +         * to hold the domain lock.
> >>> +         */
> >>> +        domain_lock(d);
> >>> +        if ( d->is_dying )
> >>> +            goto unlock;
> >>> +    }
> >>
> >> This shouldn't happen very often, but it's a pretty heavy lock.
> >> It's a fundamental aspect of xenheap pages that their disposal
> >> can be delayed until almost the last moment of guest cleanup. I
> >> continue to think it's not really a good idea to have special
> >> purpose allocation (and mapping) accompanied by these pages
> >> getting taken care of by the generic relinquish-resources logic
> >> here (from a more general pov such is of course often nice to
> >> have). Instead of freeing these pages there, couldn't they just
> >> be taken off d->page_list, with the unmapping and freeing left
> >> as it was?
> >
> > I don't think this can be achieved without being able to de-assign
> > pages, and I don't really want to have to invent new logic to do
> > that (basically re-implementing what happens to xenheap pages).
>
> Where's the connection to being able to de-assign pages here?
> There'll be one when the same conversion is to be done for
> gnttab code, but I don't see it here - the shared info page is
> never to be de-assigned. As to gnttab code, I think it was
> noted before that we may be better off not "unpopulating"
> status pages when switching back from v2 to v1. At which point
> the de-assignment need would go away there, too.

Ok, maybe I'm misunderstanding something then. We need to call
free_domheap_pages() on all pages assigned to a domain so that the
domain references get dropped. The xenpage ref is dropped when
d->xenheap_pages == 0. The domheap ref is dropped when
domain_adjust_tot_pages() returns zero. (This is what I meant by
de-assigning... but that was probably a poor choice of words.)
So, because domain_adjust_tot_pages() returns d->tot_pages (which
includes the extra_pages count), it won't fall to zero until the
last put_page() on any PGC_extra page. So how is it possible to
free shared_info in domain destroy? We'll never get that far,
because the domheap ref will never get dropped. I guess this could
be fixed by having domain_adjust_tot_pages() return the same value
as domain_tot_pages() (i.e. tot_pages - extra_pages). Is that what
you're suggesting?

> > I really don't think it is that bad to deal with shared info
> > and grant table pages as domheap pages. Yes, we have to be
> > careful, but in this case the lock is only taken in a
> > toolstack update of the wallclock and, apart from start-of-day
> > access, I think this is limited to XEN_DOMCTL_settimeoffset
> > and XEN_DOMCTL_setvcpucontext, neither of which I believe is
> > particularly performance-critical.
>
> It's not, I agree (and hence I had started my previous reply
> also with "This shouldn't happen very often"). How all of this
> is going to look with the new extra_page_list I'll have to see
> anyway. But for now I remain unconvinced of the want / need to
> de-allocate the shared info page early.

Well, hopefully I've explained above why that is currently
necessary if it becomes a domheap page.

  Paul
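Paul's accounting argument, rendered as a simplified sketch (the real domain_adjust_tot_pages() also handles outstanding claims; this reduced version only shows why the final domain reference would never be dropped):

/*
 * Illustrative only: tot_pages still counts PGC_extra pages, so the
 * value returned here cannot reach zero - and free_domheap_pages()
 * cannot drop the final domain reference - until the extra pages
 * themselves have been freed.
 */
unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
{
    ASSERT(spin_is_locked(&d->page_alloc_lock));

    d->tot_pages += pages;

    /*
     * Paul's suggested alternative at the end of the mail above:
     * return d->tot_pages - d->extra_pages; i.e. what
     * domain_tot_pages() reports.
     */
    return d->tot_pages;
}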
On 11.03.2020 16:28, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 11 March 2020 09:17
>> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>>
>> On 10.03.2020 18:33, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 09 March 2020 15:56
>>>>
>>>> On 09.03.2020 11:23, paul@xen.org wrote:
>>>>> --- a/xen/common/time.c
>>>>> +++ b/xen/common/time.c
>>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>>>      uint32_t *wc_version;
>>>>>      uint64_t sec;
>>>>>
>>>>> +    if ( d != current->domain )
>>>>> +    {
>>>>> +        /*
>>>>> +         * We need to check is_dying here as, if it is set, the
>>>>> +         * shared_info may have been freed. To do this safely we need
>>>>> +         * to hold the domain lock.
>>>>> +         */
>>>>> +        domain_lock(d);
>>>>> +        if ( d->is_dying )
>>>>> +            goto unlock;
>>>>> +    }
>>>>
>>>> This shouldn't happen very often, but it's a pretty heavy lock.
>>>> It's a fundamental aspect of xenheap pages that their disposal
>>>> can be delayed until almost the last moment of guest cleanup. I
>>>> continue to think it's not really a good idea to have special
>>>> purpose allocation (and mapping) accompanied by these pages
>>>> getting taken care of by the generic relinquish-resources logic
>>>> here (from a more general pov such is of course often nice to
>>>> have). Instead of freeing these pages there, couldn't they just
>>>> be taken off d->page_list, with the unmapping and freeing left
>>>> as it was?
>>>
>>> I don't think this can be achieved without being able to de-assign
>>> pages, and I don't really want to have to invent new logic to do
>>> that (basically re-implementing what happens to xenheap pages).
>>
>> Where's the connection to being able to de-assign pages here?
>> There'll be one when the same conversion is to be done for
>> gnttab code, but I don't see it here - the shared info page is
>> never to be de-assigned. As to gnttab code, I think it was
>> noted before that we may be better off not "unpopulating"
>> status pages when switching back from v2 to v1. At which point
>> the de-assignment need would go away there, too.
>
> Ok, maybe I'm misunderstanding something then. We need to call
> free_domheap_pages() on all pages assigned to a domain so that the
> domain references get dropped. The xenpage ref is dropped when
> d->xenheap_pages == 0. The domheap ref is dropped when
> domain_adjust_tot_pages() returns zero. (This is what I meant by
> de-assigning... but that was probably a poor choice of words.)
> So, because domain_adjust_tot_pages() returns d->tot_pages (which
> includes the extra_pages count), it won't fall to zero until the
> last put_page() on any PGC_extra page. So how is it possible to
> free shared_info in domain destroy? We'll never get that far,
> because the domheap ref will never get dropped.

Well, now that these pages sit on a separate list, it would look
even less problematic than before to me to also give them special
treatment here: You wouldn't even have to take them off the list
anymore, but just call domain_adjust_tot_pages() with
-d->extra_pages (and suitably deal with the return value; perhaps
for consistency then followed by also zeroing d->extra_pages, so
overall accounting still looks correct). Actually taking these
pages off the list could (for dumping purposes) then be done
alongside their actual freeing. Such a transition would apparently
imply clearing PGC_extra alongside the new
domain_adjust_tot_pages() call.

I realize though that the end result won't be much different from
the current PGC_xen_heap handling (at least as far as domain
cleanup goes, but after all that's what I'm in fact trying to
convince you of), so the question would be whether the whole
transition then is worth it. Without having seen at least a sketch
of the LU code that is affected by the current behavior, it
remains hard for me to tell what issues might remain, despite your
and David's explanations.

> I guess this could be fixed by having domain_adjust_tot_pages()
> return the same value as domain_tot_pages() (i.e.
> tot_pages - extra_pages). Is that what you're suggesting?

That's an option, too, but imo less desirable.

Jan
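A rough rendering of the special treatment Jan outlines above, assuming the per-domain extra_pages count and extra_page_list introduced in v6 of the series; this is a sketch of the idea, not actual tree code:

static void relinquish_extra_pages(struct domain *d)
{
    struct page_info *pg;

    spin_lock(&d->page_alloc_lock);

    /*
     * Take the extra pages out of the tot_pages accounting so the
     * "last page freed" check can fire; the return value would need
     * dealing with as for any other caller.
     */
    domain_adjust_tot_pages(d, -(long)d->extra_pages);
    d->extra_pages = 0;

    /* Clear PGC_extra; the pages stay listed until actually freed. */
    page_list_for_each ( pg, &d->extra_page_list )
        pg->count_info &= ~PGC_extra;

    spin_unlock(&d->page_alloc_lock);
}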
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 12 March 2020 08:34
> Subject: RE: [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>
> On 11.03.2020 16:28, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 11 March 2020 09:17
> >>
> >> On 10.03.2020 18:33, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 09 March 2020 15:56
> >>>>
> >>>> On 09.03.2020 11:23, paul@xen.org wrote:
> >>>>> --- a/xen/common/time.c
> >>>>> +++ b/xen/common/time.c
> >>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >>>>>      uint32_t *wc_version;
> >>>>>      uint64_t sec;
> >>>>>
> >>>>> +    if ( d != current->domain )
> >>>>> +    {
> >>>>> +        /*
> >>>>> +         * We need to check is_dying here as, if it is set, the
> >>>>> +         * shared_info may have been freed. To do this safely we need
> >>>>> +         * to hold the domain lock.
> >>>>> +         */
> >>>>> +        domain_lock(d);
> >>>>> +        if ( d->is_dying )
> >>>>> +            goto unlock;
> >>>>> +    }
> >>>>
> >>>> This shouldn't happen very often, but it's a pretty heavy lock.
> >>>> It's a fundamental aspect of xenheap pages that their disposal
> >>>> can be delayed until almost the last moment of guest cleanup. I
> >>>> continue to think it's not really a good idea to have special
> >>>> purpose allocation (and mapping) accompanied by these pages
> >>>> getting taken care of by the generic relinquish-resources logic
> >>>> here (from a more general pov such is of course often nice to
> >>>> have). Instead of freeing these pages there, couldn't they just
> >>>> be taken off d->page_list, with the unmapping and freeing left
> >>>> as it was?
> >>>
> >>> I don't think this can be achieved without being able to de-assign
> >>> pages, and I don't really want to have to invent new logic to do
> >>> that (basically re-implementing what happens to xenheap pages).
> >>
> >> Where's the connection to being able to de-assign pages here?
> >> There'll be one when the same conversion is to be done for
> >> gnttab code, but I don't see it here - the shared info page is
> >> never to be de-assigned. As to gnttab code, I think it was
> >> noted before that we may be better off not "unpopulating"
> >> status pages when switching back from v2 to v1. At which point
> >> the de-assignment need would go away there, too.
> >
> > Ok, maybe I'm misunderstanding something then. We need to call
> > free_domheap_pages() on all pages assigned to a domain so that the
> > domain references get dropped. The xenpage ref is dropped when
> > d->xenheap_pages == 0. The domheap ref is dropped when
> > domain_adjust_tot_pages() returns zero. (This is what I meant by
> > de-assigning... but that was probably a poor choice of words.)
> > So, because domain_adjust_tot_pages() returns d->tot_pages (which
> > includes the extra_pages count), it won't fall to zero until the
> > last put_page() on any PGC_extra page. So how is it possible to
> > free shared_info in domain destroy? We'll never get that far,
> > because the domheap ref will never get dropped.
>
> Well, now that these pages sit on a separate list, it would look
> even less problematic than before to me to also give them special
> treatment here: You wouldn't even have to take them off the list
> anymore, but just call domain_adjust_tot_pages() with
> -d->extra_pages (and suitably deal with the return value; perhaps
> for consistency then followed by also zeroing d->extra_pages, so
> overall accounting still looks correct). Actually taking these
> pages off the list could (for dumping purposes) then be done
> alongside their actual freeing. Such a transition would apparently
> imply clearing PGC_extra alongside the new
> domain_adjust_tot_pages() call.
>
> I realize though that the end result won't be much different from
> the current PGC_xen_heap handling (at least as far as domain
> cleanup goes, but after all that's what I'm in fact trying to
> convince you of), so the question would be whether the whole
> transition then is worth it.

Yes... with such adjustments the code gets increasingly pointless
from a simplification PoV... which is why I coded this patch as it
is. I am happy to make the concession of using the extra page list
for dumping purposes, and to avoid the need to special-case
PGC_extra pages in a few places, but otherwise I want them to be
treated as 'normal' domheap pages.

So, if you're not willing to ack this patch then I guess I'll have
to re-send the series including the grant table changes and the
removal of the xenpage list.

  Paul
On 12.03.2020 10:09, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 12 March 2020 08:34
>> Subject: RE: [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>>
>> On 11.03.2020 16:28, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 11 March 2020 09:17
>>>>
>>>> On 10.03.2020 18:33, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 09 March 2020 15:56
>>>>>>
>>>>>> On 09.03.2020 11:23, paul@xen.org wrote:
>>>>>>> --- a/xen/common/time.c
>>>>>>> +++ b/xen/common/time.c
>>>>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>>>>>      uint32_t *wc_version;
>>>>>>>      uint64_t sec;
>>>>>>>
>>>>>>> +    if ( d != current->domain )
>>>>>>> +    {
>>>>>>> +        /*
>>>>>>> +         * We need to check is_dying here as, if it is set, the
>>>>>>> +         * shared_info may have been freed. To do this safely we need
>>>>>>> +         * to hold the domain lock.
>>>>>>> +         */
>>>>>>> +        domain_lock(d);
>>>>>>> +        if ( d->is_dying )
>>>>>>> +            goto unlock;
>>>>>>> +    }
>>>>>>
>>>>>> This shouldn't happen very often, but it's a pretty heavy lock.
>>>>>> It's a fundamental aspect of xenheap pages that their disposal
>>>>>> can be delayed until almost the last moment of guest cleanup. I
>>>>>> continue to think it's not really a good idea to have special
>>>>>> purpose allocation (and mapping) accompanied by these pages
>>>>>> getting taken care of by the generic relinquish-resources logic
>>>>>> here (from a more general pov such is of course often nice to
>>>>>> have). Instead of freeing these pages there, couldn't they just
>>>>>> be taken off d->page_list, with the unmapping and freeing left
>>>>>> as it was?
>>>>>
>>>>> I don't think this can be achieved without being able to de-assign
>>>>> pages, and I don't really want to have to invent new logic to do
>>>>> that (basically re-implementing what happens to xenheap pages).
>>>>
>>>> Where's the connection to being able to de-assign pages here?
>>>> There'll be one when the same conversion is to be done for
>>>> gnttab code, but I don't see it here - the shared info page is
>>>> never to be de-assigned. As to gnttab code, I think it was
>>>> noted before that we may be better off not "unpopulating"
>>>> status pages when switching back from v2 to v1. At which point
>>>> the de-assignment need would go away there, too.
>>>
>>> Ok, maybe I'm misunderstanding something then. We need to call
>>> free_domheap_pages() on all pages assigned to a domain so that the
>>> domain references get dropped. The xenpage ref is dropped when
>>> d->xenheap_pages == 0. The domheap ref is dropped when
>>> domain_adjust_tot_pages() returns zero. (This is what I meant by
>>> de-assigning... but that was probably a poor choice of words.)
>>> So, because domain_adjust_tot_pages() returns d->tot_pages (which
>>> includes the extra_pages count), it won't fall to zero until the
>>> last put_page() on any PGC_extra page. So how is it possible to
>>> free shared_info in domain destroy? We'll never get that far,
>>> because the domheap ref will never get dropped.
>>
>> Well, now that these pages sit on a separate list, it would look
>> even less problematic than before to me to also give them special
>> treatment here: You wouldn't even have to take them off the list
>> anymore, but just call domain_adjust_tot_pages() with
>> -d->extra_pages (and suitably deal with the return value; perhaps
>> for consistency then followed by also zeroing d->extra_pages, so
>> overall accounting still looks correct). Actually taking these
>> pages off the list could (for dumping purposes) then be done
>> alongside their actual freeing. Such a transition would apparently
>> imply clearing PGC_extra alongside the new
>> domain_adjust_tot_pages() call.
>>
>> I realize though that the end result won't be much different from
>> the current PGC_xen_heap handling (at least as far as domain
>> cleanup goes, but after all that's what I'm in fact trying to
>> convince you of), so the question would be whether the whole
>> transition then is worth it.
>
> Yes... with such adjustments the code gets increasingly pointless
> from a simplification PoV... which is why I coded this patch as it
> is. I am happy to make the concession of using the extra page list
> for dumping purposes, and to avoid the need to special-case
> PGC_extra pages in a few places, but otherwise I want them to be
> treated as 'normal' domheap pages.
>
> So, if you're not willing to ack this patch

Well - I'm not the only one in the position to ack this. The
problem is that right now this is just a discussion between the
two of us.

> then I guess I'll have to re-send the series including the
> grant table changes and the removal of the xenpage list.

I'm not sure I see how this would help. My reservations against
the earlier freeing of the extra pages wouldn't go away. But let
me first get around to looking at v6.

Jan
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 5298d80bd2..741f6dd444 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -1005,6 +1005,8 @@ int domain_relinquish_resources(struct domain *d)
         BUG();
     }
 
+    free_shared_info(d);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f6ed25e8ee..60623beba1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -261,6 +261,8 @@ void dump_pageframe_info(struct domain *d)
                page->count_info, page->u.inuse.type_info);
     }
     spin_unlock(&d->page_alloc_lock);
+
+    dump_shared_info(d);
 }
 
 void update_guest_memory_policy(struct vcpu *v,
@@ -693,7 +695,6 @@ void arch_domain_destroy(struct domain *d)
         pv_domain_destroy(d);
     free_perdomain_mappings(d);
 
-    free_shared_info(d);
     cleanup_domain_irq_mapping(d);
 
     psr_domain_free(d);
@@ -2249,6 +2250,8 @@ int domain_relinquish_resources(struct domain *d)
     if ( is_hvm_domain(d) )
         hvm_domain_relinquish_resources(d);
 
+    free_shared_info(d);
+
     return 0;
 }
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ba7a905258..0b1c722708 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1650,24 +1650,54 @@ int continue_hypercall_on_cpu(
 
 int alloc_shared_info(struct domain *d, unsigned int memflags)
 {
-    if ( (d->shared_info.virt = alloc_xenheap_pages(0, memflags)) == NULL )
+    struct page_info *pg;
+
+    pg = alloc_domheap_page(d, MEMF_no_refcount | memflags);
+    if ( !pg )
         return -ENOMEM;
 
-    d->shared_info.mfn = virt_to_mfn(d->shared_info.virt);
+    if ( !get_page_and_type(pg, d, PGT_writable_page) )
+    {
+        /*
+         * The domain should not be running at this point so there is
+         * no way we should reach this error path.
+         */
+        ASSERT_UNREACHABLE();
+        return -ENODATA;
+    }
+
+    d->shared_info.mfn = page_to_mfn(pg);
+    d->shared_info.virt = __map_domain_page_global(pg);
 
     clear_page(d->shared_info.virt);
-    share_xen_page_with_guest(mfn_to_page(d->shared_info.mfn), d, SHARE_rw);
 
     return 0;
 }
 
 void free_shared_info(struct domain *d)
 {
+    struct page_info *pg;
+
     if ( !d->shared_info.virt )
         return;
 
-    free_xenheap_page(d->shared_info.virt);
+    unmap_domain_page_global(d->shared_info.virt);
     d->shared_info.virt = NULL;
+
+    pg = mfn_to_page(d->shared_info.mfn);
+
+    put_page_alloc_ref(pg);
+    put_page_and_type(pg);
+}
+
+void dump_shared_info(struct domain *d)
+{
+    domain_lock(d);
+
+    if ( d->shared_info.virt )
+        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
+
+    domain_unlock(d);
 }
 
 /*
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e86e2bfab0..a17422284d 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1325,6 +1325,9 @@ void evtchn_destroy(struct domain *d)
 {
     unsigned int i;
 
+    /* This must be done before shared_info is freed */
+    BUG_ON(!d->shared_info.virt);
+
     /* After this barrier no new event-channel allocations can occur. */
     BUG_ON(!d->is_dying);
     spin_barrier(&d->event_lock);
diff --git a/xen/common/time.c b/xen/common/time.c
index 58fa9abc40..ada02faf07 100644
--- a/xen/common/time.c
+++ b/xen/common/time.c
@@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
     uint32_t *wc_version;
     uint64_t sec;
 
+    if ( d != current->domain )
+    {
+        /*
+         * We need to check is_dying here as, if it is set, the
+         * shared_info may have been freed. To do this safely we need
+         * to hold the domain lock.
+         */
+        domain_lock(d);
+        if ( d->is_dying )
+            goto unlock;
+    }
+
     spin_lock(&wc_lock);
 
     wc_version = &shared_info(d, wc_version);
@@ -121,6 +133,9 @@ void update_domain_wallclock_time(struct domain *d)
 
     *wc_version = version_update_end(*wc_version);
     spin_unlock(&wc_lock);
+ unlock:
+    if ( d != current->domain )
+        domain_unlock(d);
 }
 
 /* Set clock to <secs,usecs> after 00:00:00 UTC, 1 January, 1970. */
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 740e2032ad..740c0537fa 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -132,5 +132,6 @@ void vnuma_destroy(struct vnuma_info *vnuma);
 
 int alloc_shared_info(struct domain *d, unsigned int memflags);
 void free_shared_info(struct domain *d);
+void dump_shared_info(struct domain *d);
 
 #endif /* __XEN_DOMAIN_H__ */