Message ID | 20221014080917.14980-1-Henry.Wang@arm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2] xen/arm: p2m: Populate pages for GICv2 mapping in arch_domain_create() | expand |
On 14.10.2022 10:09, Henry Wang wrote: > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -740,6 +740,20 @@ int arch_domain_create(struct domain *d, > BUG(); > } > > + /* > + * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area > + * when the domain is created. Considering the worst case for page > + * tables and keep a buffer, populate 16 pages to the P2M pages pool here. > + * For GICv3, the above-mentioned P2M mapping is not necessary, but since > + * the allocated 16 pages here would not be lost, hence populate these > + * pages unconditionally. > + */ > + spin_lock(&d->arch.paging.lock); > + rc = p2m_set_allocation(d, 16, NULL); > + spin_unlock(&d->arch.paging.lock); > + if ( rc != 0 ) > + goto fail; Putting this level of knowledge here feels like a layering violation to me. My first suggestion would be to move this call somewhere under p2m_init(). If that's not possible for some reason, I'd like to suggest passing 1 here as the count and then adding a min-acceptable check to p2m_set_allocation() along the lines of x86'es shadow_set_allocation(). That way you'd also guarantee the minimum number of pages in case a subsequent tiny allocation request came in via domctl. > @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) > if ( !p2m->domain ) > return; > > + if ( !page_list_empty(&p2m->pages) ) > + p2m_teardown(d, false); > + > + if ( d->arch.paging.p2m_total_pages != 0 ) > + { > + spin_lock(&d->arch.paging.lock); > + p2m_set_allocation(d, 0, NULL); > + spin_unlock(&d->arch.paging.lock); > + ASSERT(d->arch.paging.p2m_total_pages == 0); > + } Is it intentional to largely open-code p2m_teardown_allocation() here? Jan
Hi Jan, Thanks for the review. > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > On 14.10.2022 10:09, Henry Wang wrote: > > --- a/xen/arch/arm/domain.c > > +++ b/xen/arch/arm/domain.c > > @@ -740,6 +740,20 @@ int arch_domain_create(struct domain *d, > > BUG(); > > } > > > > + /* > > + * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 > area > > + * when the domain is created. Considering the worst case for page > > + * tables and keep a buffer, populate 16 pages to the P2M pages pool > here. > > + * For GICv3, the above-mentioned P2M mapping is not necessary, but > since > > + * the allocated 16 pages here would not be lost, hence populate these > > + * pages unconditionally. > > + */ > > + spin_lock(&d->arch.paging.lock); > > + rc = p2m_set_allocation(d, 16, NULL); > > + spin_unlock(&d->arch.paging.lock); > > + if ( rc != 0 ) > > + goto fail; > > Putting this level of knowledge here feels like a layering violation to > me. My first suggestion would be to move this call somewhere under > p2m_init(). That is definitely possible. If Julien or other Arm maintainers are not against that I am happy to move this to p2m_init() in v3. The reason why the above block is placed here is just I thought to use d->arch.vgic.version to only populate the 16 pages for GICv2 in the beginning, and d->arch.vgic.version is first assigned later after p2m_init(), but later we decided to populated the pages unconditionally so actually now we can move the part to p2m_init(). > If that's not possible for some reason, I'd like to suggest > passing 1 here as the count and then adding a min-acceptable check to > p2m_set_allocation() along the lines of x86'es shadow_set_allocation(). > That way you'd also guarantee the minimum number of pages in case a > subsequent tiny allocation request came in via domctl. > > > @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) > > if ( !p2m->domain ) > > return; > > > > + if ( !page_list_empty(&p2m->pages) ) > > + p2m_teardown(d, false); > > + > > + if ( d->arch.paging.p2m_total_pages != 0 ) > > + { > > + spin_lock(&d->arch.paging.lock); > > + p2m_set_allocation(d, 0, NULL); > > + spin_unlock(&d->arch.paging.lock); > > + ASSERT(d->arch.paging.p2m_total_pages == 0); > > + } > > Is it intentional to largely open-code p2m_teardown_allocation() here? Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't want any preemption here. Kind regards, Henry > > Jan
On 14.10.2022 11:28, Henry Wang wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> >> On 14.10.2022 10:09, Henry Wang wrote: >>> @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) >>> if ( !p2m->domain ) >>> return; >>> >>> + if ( !page_list_empty(&p2m->pages) ) >>> + p2m_teardown(d, false); >>> + >>> + if ( d->arch.paging.p2m_total_pages != 0 ) >>> + { >>> + spin_lock(&d->arch.paging.lock); >>> + p2m_set_allocation(d, 0, NULL); >>> + spin_unlock(&d->arch.paging.lock); >>> + ASSERT(d->arch.paging.p2m_total_pages == 0); >>> + } >> >> Is it intentional to largely open-code p2m_teardown_allocation() here? > > Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't want > any preemption here. Well, this can be dealt with by adding a parameter to the function, or by looping over it until it returns other than -ERESTART. Both would seem better to me than this duplication of functionality (but I'm not a maintainer of this code, as you know). Jan
On 14/10/2022 10:28, Henry Wang wrote: > Hi Jan, > > Thanks for the review. > >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in >> arch_domain_create() >> >> On 14.10.2022 10:09, Henry Wang wrote: >>> --- a/xen/arch/arm/domain.c >>> +++ b/xen/arch/arm/domain.c >>> @@ -740,6 +740,20 @@ int arch_domain_create(struct domain *d, >>> BUG(); >>> } >>> >>> + /* >>> + * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 >> area >>> + * when the domain is created. Considering the worst case for page >>> + * tables and keep a buffer, populate 16 pages to the P2M pages pool >> here. >>> + * For GICv3, the above-mentioned P2M mapping is not necessary, but >> since >>> + * the allocated 16 pages here would not be lost, hence populate these >>> + * pages unconditionally. >>> + */ >>> + spin_lock(&d->arch.paging.lock); >>> + rc = p2m_set_allocation(d, 16, NULL); >>> + spin_unlock(&d->arch.paging.lock); >>> + if ( rc != 0 ) >>> + goto fail; >> >> Putting this level of knowledge here feels like a layering violation to >> me. My first suggestion would be to move this call somewhere under >> p2m_init(). > > That is definitely possible. If Julien or other Arm maintainers are not > against that I am happy to move this to p2m_init() in v3. I understand both of Jan and your concern. I don't really have a strong opinion either way. You are the author of the patch, so I will let you chose. > > The reason why the above block is placed here is just I thought to use > d->arch.vgic.version to only populate the 16 pages for GICv2 in the > beginning, and d->arch.vgic.version is first assigned later after p2m_init(), > but later we decided to populated the pages unconditionally so actually > now we can move the part to p2m_init(). > >> If that's not possible for some reason, I'd like to suggest >> passing 1 here as the count and then adding a min-acceptable check to >> p2m_set_allocation() along the lines of x86'es shadow_set_allocation(). >> That way you'd also guarantee the minimum number of pages in case a >> subsequent tiny allocation request came in via domctl. I really dislike this. If the user ask for 1 pages and we only allow 16. Then this should be rejected (not bumped to 16). However, the code in p2m_set_allocation() will only look at the free pages (like x86 does). So what you suggest would not do what you want. >> >>> @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) >>> if ( !p2m->domain ) >>> return; >>> >>> + if ( !page_list_empty(&p2m->pages) ) >>> + p2m_teardown(d, false); >>> + >>> + if ( d->arch.paging.p2m_total_pages != 0 ) >>> + { >>> + spin_lock(&d->arch.paging.lock); >>> + p2m_set_allocation(d, 0, NULL); >>> + spin_unlock(&d->arch.paging.lock); >>> + ASSERT(d->arch.paging.p2m_total_pages == 0); >>> + } >> >> Is it intentional to largely open-code p2m_teardown_allocation() here? > > Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't want > any preemption here. Like Jan, I would prefer if we can avoid the duplication. The loop suggested by Jan should work. Cheers,
Hi Henry, On 14/10/2022 09:09, Henry Wang wrote: > Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area > when the domain is created. Considering the worst case of page tables > which requires 6 P2M pages as the two pages will be consecutive but not > necessarily in the same L3 page table and keep a buffer, populate 16 > pages as the default value to the P2M pages pool in arch_domain_create() > at the domain creation stage to satisfy the GICv2 requirement. For > GICv3, the above-mentioned P2M mapping is not necessary, but since the > allocated 16 pages here would not be lost, hence populate these pages > unconditionally. > > With the default 16 P2M pages populated, there would be a case that > failures would happen in the domain creation with P2M pages already in > use. To properly free the P2M for this case, firstly support the > optionally preemption of p2m_teardown(), then call p2m_teardown() and > p2m_set_allocation(d, 0, NULL) in p2m_final_teardown() if needed. > > Fixes: cbea5a1149ca ("xen/arm: Allocate and free P2M pages from the P2M pool") > Suggested-by: Julien Grall <jgrall@amazon.com> > Signed-off-by: Henry Wang <Henry.Wang@arm.com> > --- > This should also be backported to 4.13, 4.14, 4.15 and 4.16. > v2 changes: > - Move the p2m_set_allocation(d, 0, NULL); to p2m_final_teardown(). > - Support optionally preemption of p2m_teardown(), and make the calling of > p2m_teardown() preemptively when relinquish the resources, non-preemptively > in p2m_final_teardown(). > - Refactor the error handling to make the code use less spin_unlock. > - Explain the worst case of page tables and the unconditional population > of pages in commit message. > - Mention the unconditional population of pages in in-code comment. > --- > xen/arch/arm/domain.c | 16 +++++++++++++++- > xen/arch/arm/include/asm/p2m.h | 11 +++++++---- > xen/arch/arm/p2m.c | 15 +++++++++++++-- > 3 files changed, 35 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 2c84e6dbbb..831e248ad7 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -740,6 +740,20 @@ int arch_domain_create(struct domain *d, > BUG(); > } > > + /* > + * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area > + * when the domain is created. Considering the worst case for page > + * tables and keep a buffer, populate 16 pages to the P2M pages pool here. > + * For GICv3, the above-mentioned P2M mapping is not necessary, but since > + * the allocated 16 pages here would not be lost, hence populate these > + * pages unconditionally. > + */ > + spin_lock(&d->arch.paging.lock); > + rc = p2m_set_allocation(d, 16, NULL); > + spin_unlock(&d->arch.paging.lock); > + if ( rc != 0 ) > + goto fail; > + > if ( (rc = domain_vgic_register(d, &count)) != 0 ) > goto fail; > > @@ -1064,7 +1078,7 @@ int domain_relinquish_resources(struct domain *d) > return ret; > > PROGRESS(p2m): > - ret = p2m_teardown(d); > + ret = p2m_teardown(d, true); > if ( ret ) > return ret; > > diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h > index 42bfd548c4..480d65e95e 100644 > --- a/xen/arch/arm/include/asm/p2m.h > +++ b/xen/arch/arm/include/asm/p2m.h > @@ -194,14 +194,17 @@ int p2m_init(struct domain *d); > > /* > * The P2M resources are freed in two parts: > - * - p2m_teardown() will be called when relinquish the resources. It > - * will free large resources (e.g. intermediate page-tables) that > - * requires preemption. > + * - p2m_teardown() will be called preemptively when relinquish the > + * resources, in which case it will free large resources (e.g. intermediate > + * page-tables) that requires preemption. > * - p2m_final_teardown() will be called when domain struct is been > * freed. This *cannot* be preempted and therefore one small > * resources should be freed here. > + * Note that p2m_final_teardown() will also call p2m_teardown(), to properly > + * free the P2M when failures happen in the domain creation with P2M pages > + * already in use. In this case p2m_teardown() is called non-preemptively. > */ > -int p2m_teardown(struct domain *d); > +int p2m_teardown(struct domain *d, bool allow_preemption); > void p2m_final_teardown(struct domain *d); > > /* > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index f17500ddf3..707bd3e2e3 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -1685,7 +1685,7 @@ static void p2m_free_vmid(struct domain *d) > spin_unlock(&vmid_alloc_lock); > } > > -int p2m_teardown(struct domain *d) > +int p2m_teardown(struct domain *d, bool allow_preemption) > { I think the part to clean & invalidate the root should not be necessary if the domain is not scheduled. Similarly, I think we might only need to do once by domain (rather than for every call). So I would consider to move the logic outside of the function. That's not for 4.17 thought. > struct p2m_domain *p2m = p2m_get_hostp2m(d); > unsigned long count = 0; > @@ -1716,7 +1716,7 @@ int p2m_teardown(struct domain *d) > p2m_free_page(p2m->domain, pg); > count++; > /* Arbitrarily preempt every 512 iterations */ > - if ( !(count % 512) && hypercall_preempt_check() ) > + if ( allow_preemption && !(count % 512) && hypercall_preempt_check() ) > { > rc = -ERESTART; > break; > @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) > if ( !p2m->domain ) > return; > > + if ( !page_list_empty(&p2m->pages) ) Did you add this check to avoid the clean & invalidate if the list is empty? > + p2m_teardown(d, false); Today, it should be fine to ignore p2m_teardown(). But I would prefer if we add an ASSERT()/BUG_ON() (or else) to make confirm this is the case. This also wants to be documented on top of p2m_teardown() as it would be easier to know that the function should always return 0 when !allow_preemption is not set. I also noticed that relinquish_p2m_mapping() is not called. This should be fine for us because arch_domain_create() should never create a mapping that requires p2m_put_l3_page() to be called. I think it would be good to check it in __p2m_set_entry(). So we don't end up to add such mappings by mistake. I would have suggested to add a comment only for version and send a follow-up patch. But I don't exactly know where to put it. > + > + if ( d->arch.paging.p2m_total_pages != 0 ) > + { > + spin_lock(&d->arch.paging.lock); > + p2m_set_allocation(d, 0, NULL); > + spin_unlock(&d->arch.paging.lock); > + ASSERT(d->arch.paging.p2m_total_pages == 0); > + } > + > ASSERT(page_list_empty(&p2m->pages)); I would move this assert between the two ifs you added. > ASSERT(page_list_empty(&d->arch.paging.p2m_freelist)); > Cheers,
Hi Julien, > -----Original Message----- > From: Julien Grall <julien@xen.org> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > >>> + spin_lock(&d->arch.paging.lock); > >>> + rc = p2m_set_allocation(d, 16, NULL); > >>> + spin_unlock(&d->arch.paging.lock); > >>> + if ( rc != 0 ) > >>> + goto fail; > >> > >> Putting this level of knowledge here feels like a layering violation to > >> me. My first suggestion would be to move this call somewhere under > >> p2m_init(). > > > > That is definitely possible. If Julien or other Arm maintainers are not > > against that I am happy to move this to p2m_init() in v3. > I understand both of Jan and your concern. I don't really have a strong > opinion either way. > > You are the author of the patch, so I will let you chose. Then p2m_init(), just want to make everyone happy :))) > > >>> + if ( d->arch.paging.p2m_total_pages != 0 ) > >>> + { > >>> + spin_lock(&d->arch.paging.lock); > >>> + p2m_set_allocation(d, 0, NULL); > >>> + spin_unlock(&d->arch.paging.lock); > >>> + ASSERT(d->arch.paging.p2m_total_pages == 0); > >>> + } > >> > >> Is it intentional to largely open-code p2m_teardown_allocation() here? > > > > Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't want > > any preemption here. > > Like Jan, I would prefer if we can avoid the duplication. The loop > suggested by Jan should work. I am a little bit worried about the -ENOMEM, if -ENOMEM is returned from p2m_teardown_allocation(d), I think we are in the infinite loop, or did I miss understood the loop that Jan referred to? Kind regards, Henry > > Cheers, > > -- > Julien Grall
On 14.10.2022 12:38, Henry Wang wrote: >> -----Original Message----- >> From: Julien Grall <julien@xen.org> >>>>> + if ( d->arch.paging.p2m_total_pages != 0 ) >>>>> + { >>>>> + spin_lock(&d->arch.paging.lock); >>>>> + p2m_set_allocation(d, 0, NULL); >>>>> + spin_unlock(&d->arch.paging.lock); >>>>> + ASSERT(d->arch.paging.p2m_total_pages == 0); >>>>> + } >>>> >>>> Is it intentional to largely open-code p2m_teardown_allocation() here? >>> >>> Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't want >>> any preemption here. >> >> Like Jan, I would prefer if we can avoid the duplication. The loop >> suggested by Jan should work. > > I am a little bit worried about the -ENOMEM, if -ENOMEM is > returned from p2m_teardown_allocation(d), I think we are in > the infinite loop, or did I miss understood the loop that Jan referred > to? Where would -ENOMEM come from? We're firmly freeing memory here. -ENOMEM can only occur for a non-zero 2nd argument. Jan
Hi Jan, > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > On 14.10.2022 12:38, Henry Wang wrote: > >> -----Original Message----- > >> From: Julien Grall <julien@xen.org> > >>>>> + if ( d->arch.paging.p2m_total_pages != 0 ) > >>>>> + { > >>>>> + spin_lock(&d->arch.paging.lock); > >>>>> + p2m_set_allocation(d, 0, NULL); > >>>>> + spin_unlock(&d->arch.paging.lock); > >>>>> + ASSERT(d->arch.paging.p2m_total_pages == 0); > >>>>> + } > >>>> > >>>> Is it intentional to largely open-code p2m_teardown_allocation() here? > >>> > >>> Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't > want > >>> any preemption here. > >> > >> Like Jan, I would prefer if we can avoid the duplication. The loop > >> suggested by Jan should work. > > > > I am a little bit worried about the -ENOMEM, if -ENOMEM is > > returned from p2m_teardown_allocation(d), I think we are in > > the infinite loop, or did I miss understood the loop that Jan referred > > to? > > Where would -ENOMEM come from? We're firmly freeing memory here. - > ENOMEM > can only occur for a non-zero 2nd argument. My initial thought is the "else if" part in p2m_set_allocation. It might be wrong. Would the code below seems ok to you? int err; do { err = p2m_teardown_allocation(d) } while ( err == -ERESTART ) Kind regards, Henry > > Jan
On 14.10.2022 12:53, Henry Wang wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> >> On 14.10.2022 12:38, Henry Wang wrote: >>>> -----Original Message----- >>>> From: Julien Grall <julien@xen.org> >>>>>>> + if ( d->arch.paging.p2m_total_pages != 0 ) >>>>>>> + { >>>>>>> + spin_lock(&d->arch.paging.lock); >>>>>>> + p2m_set_allocation(d, 0, NULL); >>>>>>> + spin_unlock(&d->arch.paging.lock); >>>>>>> + ASSERT(d->arch.paging.p2m_total_pages == 0); >>>>>>> + } >>>>>> >>>>>> Is it intentional to largely open-code p2m_teardown_allocation() here? >>>>> >>>>> Yes, AFAICT p2m_teardown_allocation() is preemptible and we don't >> want >>>>> any preemption here. >>>> >>>> Like Jan, I would prefer if we can avoid the duplication. The loop >>>> suggested by Jan should work. >>> >>> I am a little bit worried about the -ENOMEM, if -ENOMEM is >>> returned from p2m_teardown_allocation(d), I think we are in >>> the infinite loop, or did I miss understood the loop that Jan referred >>> to? >> >> Where would -ENOMEM come from? We're firmly freeing memory here. - >> ENOMEM >> can only occur for a non-zero 2nd argument. > > My initial thought is the "else if" part in p2m_set_allocation. It might be > wrong. Would the code below seems ok to you? > > int err; > > do { > err = p2m_teardown_allocation(d) > } while ( err == -ERESTART ) Sure, one of several ways of doing it. Jan
Hi Jan, > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > > My initial thought is the "else if" part in p2m_set_allocation. It might be > > wrong. Would the code below seems ok to you? > > > > int err; > > > > do { > > err = p2m_teardown_allocation(d) > > } while ( err == -ERESTART ) > > Sure, one of several ways of doing it. Thanks for your confirmation. Just to play safe if you have more simple Solutions please do raise it. It is a good opportunity for me to learn and personally I am not a big fan of either do-while or the introduced "err" which is used only by p2m_teardown_allocation(d), considering the p2m_final_teardown(d) has a void return type... Thanks for your patience in advance :) Kind regards, Henry > > Jan
Hi Julien, > -----Original Message----- > From: Julien Grall <julien@xen.org> > > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > Hi Henry, > > > > > -int p2m_teardown(struct domain *d) > > +int p2m_teardown(struct domain *d, bool allow_preemption) > > { > I think the part to clean & invalidate the root should not be necessary > if the domain is not scheduled. Similarly, I think we might only need to > do once by domain (rather than for every call). So I would consider to > move the logic outside of the function. > > That's not for 4.17 thought. Sure, I can prepare the follow up patch after 4.17 as (1) I am also worried about if this patch would become bigger and bigger (2) I checked you also want other things in your below comment. > > > struct p2m_domain *p2m = p2m_get_hostp2m(d); > > unsigned long count = 0; > > @@ -1716,7 +1716,7 @@ int p2m_teardown(struct domain *d) > > p2m_free_page(p2m->domain, pg); > > count++; > > /* Arbitrarily preempt every 512 iterations */ > > - if ( !(count % 512) && hypercall_preempt_check() ) > > + if ( allow_preemption && !(count % 512) && > hypercall_preempt_check() ) > > { > > rc = -ERESTART; > > break; > > @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) > > if ( !p2m->domain ) > > return; > > > > + if ( !page_list_empty(&p2m->pages) ) > > Did you add this check to avoid the clean & invalidate if the list is empty? Yep. I think we only need the p2m_teardown() if we actually have something in p2m->pages list. > > > + p2m_teardown(d, false); > > Today, it should be fine to ignore p2m_teardown(). But I would prefer if > we add an ASSERT()/BUG_ON() (or else) to make confirm this is the case. Sorry I do not really understand why we can ignore the p2m_teardown() probably because of my English. Let's talk a bit more in C if you don't mind :)) Do you mean p2m_teardown() should be called here unconditionally without the if ( !page_list_empty(&p2m->pages) ) check? > > This also wants to be documented on top of p2m_teardown() as it would be > easier to know that the function should always return 0 when > !allow_preemption is not set. Ok, will do. > > I also noticed that relinquish_p2m_mapping() is not called. This should > be fine for us because arch_domain_create() should never create a > mapping that requires p2m_put_l3_page() to be called. > > I think it would be good to check it in __p2m_set_entry(). So we don't > end up to add such mappings by mistake. I thought for a while but failed to translate the above requirements to proper if conditions in __p2m_set_entry()... > > I would have suggested to add a comment only for version and send a > follow-up patch. But I don't exactly know where to put it. ...how about p2m_final_teardown(), we can use a TODO to explain why we don't need to call relinquish_p2m_mapping() and a following patch can fix this? > > > + > > + if ( d->arch.paging.p2m_total_pages != 0 ) > > + { > > + spin_lock(&d->arch.paging.lock); > > + p2m_set_allocation(d, 0, NULL); > > + spin_unlock(&d->arch.paging.lock); > > + ASSERT(d->arch.paging.p2m_total_pages == 0); > > + } > > + > > ASSERT(page_list_empty(&p2m->pages)); > > I would move this assert between the two ifs you added. Sure, will do in v3. Kind regards, Henry > > > ASSERT(page_list_empty(&d->arch.paging.p2m_freelist)); > > > > Cheers, > > -- > Julien Grall
On 14.10.2022 13:04, Henry Wang wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> >>> My initial thought is the "else if" part in p2m_set_allocation. It might be >>> wrong. Would the code below seems ok to you? >>> >>> int err; >>> >>> do { >>> err = p2m_teardown_allocation(d) >>> } while ( err == -ERESTART ) >> >> Sure, one of several ways of doing it. > > Thanks for your confirmation. Just to play safe if you have more simple > Solutions please do raise it. It is a good opportunity for me to learn and > personally I am not a big fan of either do-while or the introduced "err" > which is used only by p2m_teardown_allocation(d), considering the > p2m_final_teardown(d) has a void return type... Personally I would probably have written while ( p2m_teardown_allocation(d) == -ERESTART ) /* Nothing - no preemption support here. */; or while ( p2m_teardown_allocation(d) == -ERESTART ) continue; /* No preemption support here. */ . Otoh with the "err" variable you could ASSERT(!err) after the loop. Jan
Hi Jan, > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > On 14.10.2022 13:04, Henry Wang wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@suse.com> > >> > >>> My initial thought is the "else if" part in p2m_set_allocation. It might be > >>> wrong. Would the code below seems ok to you? > >>> > >>> int err; > >>> > >>> do { > >>> err = p2m_teardown_allocation(d) > >>> } while ( err == -ERESTART ) > >> > >> Sure, one of several ways of doing it. > > > > Thanks for your confirmation. Just to play safe if you have more simple > > Solutions please do raise it. It is a good opportunity for me to learn and > > personally I am not a big fan of either do-while or the introduced "err" > > which is used only by p2m_teardown_allocation(d), considering the > > p2m_final_teardown(d) has a void return type... > > Personally I would probably have written > > while ( p2m_teardown_allocation(d) == -ERESTART ) > /* Nothing - no preemption support here. */; Thanks very much for the suggestions! I didn't think of the /* */ part and I really like this idea. This said, a quick search of different coding styles and I found [1] mentioned: "Empty loop bodies should use either empty braces or continue. " So I will probably follow... > > or > > while ( p2m_teardown_allocation(d) == -ERESTART ) > continue; /* No preemption support here. */ ...this way. Great experience of learning, thanks! [1] https://google.github.io/styleguide/cppguide.html Kind regards, Henry > > . Otoh with the "err" variable you could ASSERT(!err) after the loop. > > Jan
On 14/10/2022 12:19, Henry Wang wrote: > Hi Julien, Hi Henry, >> -----Original Message----- >> From: Julien Grall <julien@xen.org> >>> struct p2m_domain *p2m = p2m_get_hostp2m(d); >>> unsigned long count = 0; >>> @@ -1716,7 +1716,7 @@ int p2m_teardown(struct domain *d) >>> p2m_free_page(p2m->domain, pg); >>> count++; >>> /* Arbitrarily preempt every 512 iterations */ >>> - if ( !(count % 512) && hypercall_preempt_check() ) >>> + if ( allow_preemption && !(count % 512) && >> hypercall_preempt_check() ) >>> { >>> rc = -ERESTART; >>> break; >>> @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) >>> if ( !p2m->domain ) >>> return; >>> >>> + if ( !page_list_empty(&p2m->pages) ) >> >> Did you add this check to avoid the clean & invalidate if the list is empty? > > Yep. I think we only need the p2m_teardown() if we actually have something > in p2m->pages list. How about adding the check in p2m_teardown()? So it will be easier to remember that the check can be dropped if we move the zeroing outside of the function. > >> >>> + p2m_teardown(d, false); >> >> Today, it should be fine to ignore p2m_teardown(). But I would prefer if >> we add an ASSERT()/BUG_ON() (or else) to make confirm this is the case. > > Sorry I do not really understand why we can ignore the p2m_teardown() > probably because of my English. No, I forgot a word in my sentence. I was meant to say that the return of p2m_teardown() can be ignored in our situation because it only return 0 or -ERESTART. The latter cannnot happen when the preemption is not enabled. But I would like to add some code (either ASSERT() or BUG_ON()) to confirm that p2m_teardown() will always return 0. > Let's talk a bit more in C if you don't mind :)) > Do you mean p2m_teardown() should be called here unconditionally without > the if ( !page_list_empty(&p2m->pages) ) check? See above. > >> >> This also wants to be documented on top of p2m_teardown() as it would be >> easier to know that the function should always return 0 when >> !allow_preemption is not set. > > Ok, will do. > >> >> I also noticed that relinquish_p2m_mapping() is not called. This should >> be fine for us because arch_domain_create() should never create a >> mapping that requires p2m_put_l3_page() to be called. >> >> I think it would be good to check it in __p2m_set_entry(). So we don't >> end up to add such mappings by mistake. > > I thought for a while but failed to translate the above requirements > to proper if conditions in __p2m_set_entry()... For checking the mapping, we can do: if ( !removing_mapping && (p2m_is_foreign(t) || (p2m_is_ram(t) && is_xenheap_mfn(mfn) ) return -EINVAL; We also need a way to check whether we are called from arch_domain_create(). I think we would need a field in the domain structure to indicate whether it is still initializating. This is a bit ugly though. Any other suggestions? > >> >> I would have suggested to add a comment only for version and send a >> follow-up patch. But I don't exactly know where to put it. > > ...how about p2m_final_teardown(), we can use a TODO to explain why > we don't need to call relinquish_p2m_mapping() and a following patch > can fix this? To me the TODO would make more sense on top of p2m_set_entry() because this is where the issue should be fixed. This is also where most of the reader will likely look if they want to understand how p2m_set_entry() can be used. We could also have a comment in p2m_final_teardown() stating that the relinquish function is not called because the P2M should not contain any mapping that requires specific operation when removed. This could point to the comment in p2m_set_entry(). Cheers,
Hi Julien, Thanks for reply and sharing your opinions! > -----Original Message----- > From: Julien Grall <julien@xen.org> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > On 14/10/2022 12:19, Henry Wang wrote: > > Hi Julien, > > Hi Henry, > > >> -----Original Message----- > >> From: Julien Grall <julien@xen.org> > >>> struct p2m_domain *p2m = p2m_get_hostp2m(d); > >>> unsigned long count = 0; > >>> @@ -1716,7 +1716,7 @@ int p2m_teardown(struct domain *d) > >>> p2m_free_page(p2m->domain, pg); > >>> count++; > >>> /* Arbitrarily preempt every 512 iterations */ > >>> - if ( !(count % 512) && hypercall_preempt_check() ) > >>> + if ( allow_preemption && !(count % 512) && > >> hypercall_preempt_check() ) > >>> { > >>> rc = -ERESTART; > >>> break; > >>> @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) > >>> if ( !p2m->domain ) > >>> return; > >>> > >>> + if ( !page_list_empty(&p2m->pages) ) > >> > >> Did you add this check to avoid the clean & invalidate if the list is empty? > > > > Yep. I think we only need the p2m_teardown() if we actually have > something > > in p2m->pages list. > > How about adding the check in p2m_teardown()? So it will be easier to > remember that the check can be dropped if we move the zeroing outside of > the function. Yes, I will turn above if check to a if ( page_list_empty(&p2m->pages) ) return 0; in the beginning of the p2m_teardown(), and do the clean & invalidate follow-up after the release. > > > > >> > >>> + p2m_teardown(d, false); > >> > >> Today, it should be fine to ignore p2m_teardown(). But I would prefer if > >> we add an ASSERT()/BUG_ON() (or else) to make confirm this is the case. > > > > Sorry I do not really understand why we can ignore the p2m_teardown() > > probably because of my English. > > No, I forgot a word in my sentence. I was meant to say that the return > of p2m_teardown() can be ignored in our situation because it only return > 0 or -ERESTART. The latter cannnot happen when the preemption is not > enabled. > > But I would like to add some code (either ASSERT() or BUG_ON()) to > confirm that p2m_teardown() will always return 0. I added the doc asked in your previous email. Also, I will use a ASSERT(p2m_teardown(d, false) == 0); in p2m_final_teardown() here. > > > Let's talk a bit more in C if you don't mind :)) > > Do you mean p2m_teardown() should be called here unconditionally > without > > the if ( !page_list_empty(&p2m->pages) ) check? > > See above. Thanks. > > > > >> > >> This also wants to be documented on top of p2m_teardown() as it would > be > >> easier to know that the function should always return 0 when > >> !allow_preemption is not set. > > > > Ok, will do. > > > >> > >> I also noticed that relinquish_p2m_mapping() is not called. This should > >> be fine for us because arch_domain_create() should never create a > >> mapping that requires p2m_put_l3_page() to be called. > >> > >> I think it would be good to check it in __p2m_set_entry(). So we don't > >> end up to add such mappings by mistake. > > > > I thought for a while but failed to translate the above requirements > > to proper if conditions in __p2m_set_entry()... > > For checking the mapping, we can do: > > if ( !removing_mapping && (p2m_is_foreign(t) || (p2m_is_ram(t) && > is_xenheap_mfn(mfn) ) > return -EINVAL; Thanks for this, I guess without your hint it will take ages for me to think of this.... > > We also need a way to check whether we are called from > arch_domain_create(). I think we would need a field in the domain > structure to indicate whether it is still initializating. > > This is a bit ugly though. Any other suggestions? My first thought is checking the implementation of domain_create() and arch_domain_create() (as both will call arch_domain_destroy() when fail) to see if there are some fields in struct domain or struct arch_domain that are set/changed in this stage so probably we can reuse. Otherwise I think adding a new field sounds a good idea. > > > > >> > >> I would have suggested to add a comment only for version and send a > >> follow-up patch. But I don't exactly know where to put it. > > > > ...how about p2m_final_teardown(), we can use a TODO to explain why > > we don't need to call relinquish_p2m_mapping() and a following patch > > can fix this? > > To me the TODO would make more sense on top of p2m_set_entry() > because > this is where the issue should be fixed. This is also where most of the > reader will likely look if they want to understand how p2m_set_entry() > can be used. Good idea, thanks for the suggestion! > > We could also have a comment in p2m_final_teardown() stating that the > relinquish function is not called because the P2M should not contain any > mapping that requires specific operation when removed. This could point > to the comment in p2m_set_entry(). Yes, my current wording for this would be: + /* + * No need to call relinquish_p2m_mapping() here because + * p2m_final_teardown() is called either after domain_relinquish_resources() + * where relinquish_p2m_mapping() has been called, or from failure path of + * domain_create()/arch_domain_create() where mappings that require + * p2m_put_l3_page() should never be created. + */ I will add the words pointing to p2m_set_entry(). Kind regards, Henry > > Cheers, > > -- > Julien Grall
On 15.10.2022 15:14, Henry Wang wrote: > >> -----Original Message----- >> From: Julien Grall <julien@xen.org> >> On 14/10/2022 12:19, Henry Wang wrote: >> >>>> -----Original Message----- >>>> From: Julien Grall <julien@xen.org> >>>>> + p2m_teardown(d, false); >>>> >>>> Today, it should be fine to ignore p2m_teardown(). But I would prefer if >>>> we add an ASSERT()/BUG_ON() (or else) to make confirm this is the case. >>> >>> Sorry I do not really understand why we can ignore the p2m_teardown() >>> probably because of my English. >> >> No, I forgot a word in my sentence. I was meant to say that the return >> of p2m_teardown() can be ignored in our situation because it only return >> 0 or -ERESTART. The latter cannnot happen when the preemption is not >> enabled. >> >> But I would like to add some code (either ASSERT() or BUG_ON()) to >> confirm that p2m_teardown() will always return 0. > > I added the doc asked in your previous email. Also, I will use a > > ASSERT(p2m_teardown(d, false) == 0); > > in p2m_final_teardown() here. Hopefully this was meant only as an abstract plan, not the exact code you mean to add? ASSERT() expressions generally should not have side effects (which includes function calls). Jan
Hi Jan, > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > > I added the doc asked in your previous email. Also, I will use a > > > > ASSERT(p2m_teardown(d, false) == 0); > > > > in p2m_final_teardown() here. > > Hopefully this was meant only as an abstract plan, not the exact code > you mean to add? ASSERT() expressions generally should not have side > effects (which includes function calls). Yeah, when I wrote the v3 code I noticed that ASSERT might be limited to the CONFIG_DEBUG so in the v3 I switched to BUG_ON which IIUC can make sure the function call is valid all the time. Kind regards, Henry > > Jan
On 17.10.2022 10:43, Henry Wang wrote: >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >>> I added the doc asked in your previous email. Also, I will use a >>> >>> ASSERT(p2m_teardown(d, false) == 0); >>> >>> in p2m_final_teardown() here. >> >> Hopefully this was meant only as an abstract plan, not the exact code >> you mean to add? ASSERT() expressions generally should not have side >> effects (which includes function calls). > > Yeah, when I wrote the v3 code I noticed that ASSERT might be limited > to the CONFIG_DEBUG so in the v3 I switched to BUG_ON which IIUC > can make sure the function call is valid all the time. But in the past we still recommended against doing so even with BUG_ON(). More recently Andrew (iirc) has voiced an opinion in the opposite direction, but I'm not aware of us actually having changed direction in this regard. Apart from that ASSERT() and BUG_ON() aren't meant to be freely interchanged - the the respective part of ./CODFING_STYLE. Jan
Hi Julien, Stefano, Bertrand, I am trying to work on the follow-up improvements about the Arm P2M code, and while trying to address the comment below, I noticed there was an unfinished discussion between me and Julien which I would like to continue and here opinions from all of you (if possible). > -----Original Message----- > From: Julien Grall <julien@xen.org> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > >> I also noticed that relinquish_p2m_mapping() is not called. This should > >> be fine for us because arch_domain_create() should never create a > >> mapping that requires p2m_put_l3_page() to be called. For the background of the discussion, this was about the failure path of arch_domain_create(), where we only call p2m_final_teardown() which does not call relinquish_p2m_mapping()... > >> > >> I think it would be good to check it in __p2m_set_entry(). So we don't > >> end up to add such mappings by mistake. > > For checking the mapping, we can do: > > if ( !removing_mapping && (p2m_is_foreign(t) || (p2m_is_ram(t) && > is_xenheap_mfn(mfn)) ) > return -EINVAL; > > We also need a way to check whether we are called from > arch_domain_create(). I think we would need a field in the domain > structure to indicate whether it is still initializating. > > This is a bit ugly though. Any other suggestions? ...and I agree with Julien that this check makes great sense and I will add this check in the follow up patch as discussed. However I am not really convinced this looks pretty, and I would like to hear opinions / suggestions about below code, does below code snippet seem plausible to you? diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c @@ -1043,6 +1043,10 @@ static int __p2m_set_entry(struct p2m_domain *p2m, */ ASSERT(target > 0 && target <= 3); + if ( !removing_mapping && d->arch.p2m.from_arch_domain_create && + (p2m_is_foreign(t) || (p2m_is_ram(t) && is_xenheap_mfn(mfn)) ) + return -EINVAL; + table = p2m_get_root_pointer(p2m, sgfn); diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c @@ -687,6 +687,8 @@ int arch_domain_create(struct domain *d, if ( (rc = p2m_init(d)) != 0 ) goto fail; + d->arch.p2m.from_arch_domain_create = true; + rc = -ENOMEM; if ( (d->shared_info = alloc_xenheap_pages(0, 0)) == NULL ) goto fail; @@ -751,6 +753,8 @@ int arch_domain_create(struct domain *d, if ( (rc = domain_vpci_init(d)) != 0 ) goto fail; + d->arch.p2m.from_arch_domain_create = false; + return 0; Kind regards, Henry
Hi Henry, On 08/12/2022 03:06, Henry Wang wrote: > I am trying to work on the follow-up improvements about the Arm P2M code, > and while trying to address the comment below, I noticed there was an unfinished > discussion between me and Julien which I would like to continue and here > opinions from all of you (if possible). > >> -----Original Message----- >> From: Julien Grall <julien@xen.org> >> Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in >> arch_domain_create() >>>> I also noticed that relinquish_p2m_mapping() is not called. This should >>>> be fine for us because arch_domain_create() should never create a >>>> mapping that requires p2m_put_l3_page() to be called. > > For the background of the discussion, this was about the failure path of > arch_domain_create(), where we only call p2m_final_teardown() which does > not call relinquish_p2m_mapping()... So all this mess with the P2M is necessary because we are mapping the GICv2 CPU interface in arch_domain_create(). I think we should consider to defer the mapping to later. Other than it makes the code simpler, it also means we could also the P2M root on the same numa node the domain is going to run (those information are passed later on). There is a couple of options: 1. Introduce a new hypercall to finish the initialization of a domain (e.g. allocating the P2M and map the GICv2 CPU interface). This option was briefly discussed with Jan (see [2])./ 2. Allocate the P2M when populate the P2M pool and defer the GICv2 CPU interface mapping until the first access (similar to how with deal with MMIO access for ACPI). I find the second option is neater but it has the drawback that it requires on more trap to the hypervisor and we can't report any mapping failure (which should not happen if the P2M was correctly sized). So I am leaning towards option 2. Any opinions? Cheers, [1] https://lore.kernel.org/xen-devel/6c07cdfc-888a-45bb-2077-528a809a62f4@suse.com/
On Fri, 9 Dec 2022, Julien Grall wrote: > Hi Henry, > > On 08/12/2022 03:06, Henry Wang wrote: > > I am trying to work on the follow-up improvements about the Arm P2M code, > > and while trying to address the comment below, I noticed there was an > > unfinished > > discussion between me and Julien which I would like to continue and here > > opinions from all of you (if possible). > > > > > -----Original Message----- > > > From: Julien Grall <julien@xen.org> > > > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > > > arch_domain_create() > > > > > I also noticed that relinquish_p2m_mapping() is not called. This > > > > > should > > > > > be fine for us because arch_domain_create() should never create a > > > > > mapping that requires p2m_put_l3_page() to be called. > > > > For the background of the discussion, this was about the failure path of > > arch_domain_create(), where we only call p2m_final_teardown() which does > > not call relinquish_p2m_mapping()... > So all this mess with the P2M is necessary because we are mapping the GICv2 > CPU interface in arch_domain_create(). I think we should consider to defer the > mapping to later. > > Other than it makes the code simpler, it also means we could also the P2M root > on the same numa node the domain is going to run (those information are passed > later on). > > There is a couple of options: > 1. Introduce a new hypercall to finish the initialization of a domain (e.g. > allocating the P2M and map the GICv2 CPU interface). This option was briefly > discussed with Jan (see [2])./ > 2. Allocate the P2M when populate the P2M pool and defer the GICv2 CPU > interface mapping until the first access (similar to how with deal with MMIO > access for ACPI). > > I find the second option is neater but it has the drawback that it requires on > more trap to the hypervisor and we can't report any mapping failure (which > should not happen if the P2M was correctly sized). So I am leaning towards > option 2. > > Any opinions? Option 1 is not great due to the extra hypercall. But I worry that option 2 might make things harder for safety because the mapping/initialization becomes "dynamic". I don't know if this is a valid concern. I would love to hear Bertrand's thoughts about it. Putting him in To:
On 09/12/2022 18:51, Julien Grall wrote: > Hi Henry, > > On 08/12/2022 03:06, Henry Wang wrote: >> I am trying to work on the follow-up improvements about the Arm P2M >> code, >> and while trying to address the comment below, I noticed there was an >> unfinished >> discussion between me and Julien which I would like to continue and here >> opinions from all of you (if possible). >> >>> -----Original Message----- >>> From: Julien Grall <julien@xen.org> >>> Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 >>> mapping in >>> arch_domain_create() >>>>> I also noticed that relinquish_p2m_mapping() is not called. This >>>>> should >>>>> be fine for us because arch_domain_create() should never create a >>>>> mapping that requires p2m_put_l3_page() to be called. >> >> For the background of the discussion, this was about the failure path of >> arch_domain_create(), where we only call p2m_final_teardown() which does >> not call relinquish_p2m_mapping()... > So all this mess with the P2M is necessary because we are mapping the > GICv2 CPU interface in arch_domain_create(). I think we should > consider to defer the mapping to later. > > Other than it makes the code simpler, it also means we could also the > P2M root on the same numa node the domain is going to run (those > information are passed later on). > > There is a couple of options: > 1. Introduce a new hypercall to finish the initialization of a domain > (e.g. allocating the P2M and map the GICv2 CPU interface). This option > was briefly discussed with Jan (see [2])./ > 2. Allocate the P2M when populate the P2M pool and defer the GICv2 > CPU interface mapping until the first access (similar to how with deal > with MMIO access for ACPI). > > I find the second option is neater but it has the drawback that it > requires on more trap to the hypervisor and we can't report any > mapping failure (which should not happen if the P2M was correctly > sized). So I am leaning towards option 2. > > Any opinions? A DOMCTL_creation_finished hypercall (name subject to improvement) is mandatory for encrypted VM support in x86 (there's crypto needed at this point to complete the measurement of the guest and create an attestation report), so we are going to gain something to this effect one way or another. Such a hypercall also removes the giant bodge of using domain_unpause_by_systemcontroller() for this purpose. But it seems like ARM already has arch_domain_creation_finished() so you can do 1) independently of adding a new hypercall. x86 uses that hook for hooking up the magic APICv sinc page into the p2m, so you're already in good(?) company with a GIC bodge. As to where the logic *should* be, it should be done in the hypercall(s) where the toolstack is describing the guests phymap to Xen. The fact that these don't exist is yet another problem needing to be worked on. That said, moving the creation side of things doesn't change the teardown requirements. When I get time to respin the fault_ttl series, Gitlab CI will be able to demonstrate the bug I keep on telling you is still there, the fix for which is taking the patch I already wrote for you. There is no correct way to move the final loop out of complete_domain_destroy(), even if in the general case you can make it more preemptible by moving the allocation later. ~Andrew
Hi Andrew, On 09/12/2022 22:18, Andrew Cooper wrote: > On 09/12/2022 18:51, Julien Grall wrote: > That said, moving the creation side of things doesn't change the > teardown requirements. When I get time to respin the fault_ttl series, > Gitlab CI will be able to demonstrate the bug I keep on telling you is > still there, the fix for which is taking the patch I already wrote for > you. Is it just a matter of rebasing the series? If so, I am happy to give a try to see what comes up... > There is no correct way to move the final loop out of > complete_domain_destroy(), even if in the general case you can make it > more preemptible by moving the allocation later. Hmmm... I don't think the call to p2m_teardown() in complete_domain_destroy() will be necessary once we move the P2M allocation out of arch_domain_create(). This is because there will be no use of the P2M after domain_relinquish_resources(). Cheers,
Hi,
Sorry for the delay but I have very limited access to my mails right now.
On 9 Dec 2022, at 23:11, Stefano Stabellini <sstabellini@kernel.org<mailto:sstabellini@kernel.org>> wrote:
On Fri, 9 Dec 2022, Julien Grall wrote:
Hi Henry,
On 08/12/2022 03:06, Henry Wang wrote:
I am trying to work on the follow-up improvements about the Arm P2M code,
and while trying to address the comment below, I noticed there was an
unfinished
discussion between me and Julien which I would like to continue and here
opinions from all of you (if possible).
-----Original Message-----
From: Julien Grall <julien@xen.org<mailto:julien@xen.org>>
Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in
arch_domain_create()
I also noticed that relinquish_p2m_mapping() is not called. This
should
be fine for us because arch_domain_create() should never create a
mapping that requires p2m_put_l3_page() to be called.
For the background of the discussion, this was about the failure path of
arch_domain_create(), where we only call p2m_final_teardown() which does
not call relinquish_p2m_mapping()...
So all this mess with the P2M is necessary because we are mapping the GICv2
CPU interface in arch_domain_create(). I think we should consider to defer the
mapping to later.
Other than it makes the code simpler, it also means we could also the P2M root
on the same numa node the domain is going to run (those information are passed
later on).
There is a couple of options:
1. Introduce a new hypercall to finish the initialization of a domain (e.g.
allocating the P2M and map the GICv2 CPU interface). This option was briefly
discussed with Jan (see [2])./
2. Allocate the P2M when populate the P2M pool and defer the GICv2 CPU
interface mapping until the first access (similar to how with deal with MMIO
access for ACPI).
I find the second option is neater but it has the drawback that it requires on
more trap to the hypervisor and we can't report any mapping failure (which
should not happen if the P2M was correctly sized). So I am leaning towards
option 2.
Any opinions?
Option 1 is not great due to the extra hypercall. But I worry that
option 2 might make things harder for safety because the
mapping/initialization becomes "dynamic". I don't know if this is a
valid concern.
I would love to hear Bertrand's thoughts about it. Putting him in To:
How option 1 would work for dom0less ?
Option 2 would make safety more challenging but not impossible (we have a lot of other use cases where we cannot map everything on boot).
I would vote for option 2 as I think we will not certify gicv2 and it is not adding an other hyper call.
Cheers
Bertrand
Hi Bertrand, On 03/01/2023 12:34, Bertrand Marquis wrote: > Hi, > > Sorry for the delay but I have very limited access to my mails right now. > >> On 9 Dec 2022, at 23:11, Stefano Stabellini <sstabellini@kernel.org >> <mailto:sstabellini@kernel.org>> wrote: >> >> On Fri, 9 Dec 2022, Julien Grall wrote: >>> Hi Henry, >>> >>> On 08/12/2022 03:06, Henry Wang wrote: >>>> I am trying to work on the follow-up improvements about the Arm P2M >>>> code, >>>> and while trying to address the comment below, I noticed there was an >>>> unfinished >>>> discussion between me and Julien which I would like to continue and here >>>> opinions from all of you (if possible). >>>> >>>>> -----Original Message----- >>>>> From: Julien Grall <julien@xen.org <mailto:julien@xen.org>> >>>>> Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 >>>>> mapping in >>>>> arch_domain_create() >>>>>>> I also noticed that relinquish_p2m_mapping() is not called. This >>>>>>> should >>>>>>> be fine for us because arch_domain_create() should never create a >>>>>>> mapping that requires p2m_put_l3_page() to be called. >>>> >>>> For the background of the discussion, this was about the failure path of >>>> arch_domain_create(), where we only call p2m_final_teardown() which does >>>> not call relinquish_p2m_mapping()... >>> So all this mess with the P2M is necessary because we are mapping the >>> GICv2 >>> CPU interface in arch_domain_create(). I think we should consider to >>> defer the >>> mapping to later. >>> >>> Other than it makes the code simpler, it also means we could also the >>> P2M root >>> on the same numa node the domain is going to run (those information >>> are passed >>> later on). >>> >>> There is a couple of options: >>> 1. Introduce a new hypercall to finish the initialization of a domain >>> (e.g. >>> allocating the P2M and map the GICv2 CPU interface). This option was >>> briefly >>> discussed with Jan (see [2])./ >>> 2. Allocate the P2M when populate the P2M pool and defer the GICv2 CPU >>> interface mapping until the first access (similar to how with deal >>> with MMIO >>> access for ACPI). >>> >>> I find the second option is neater but it has the drawback that it >>> requires on >>> more trap to the hypervisor and we can't report any mapping failure >>> (which >>> should not happen if the P2M was correctly sized). So I am leaning >>> towards >>> option 2. >>> >>> Any opinions? >> >> Option 1 is not great due to the extra hypercall. But I worry that >> option 2 might make things harder for safety because the >> mapping/initialization becomes "dynamic". I don't know if this is a >> valid concern. >> >> I would love to hear Bertrand's thoughts about it. Putting him in To: > > How option 1 would work for dom0less ? Xen would call the function directly. Effectively, this would the same as all the other "hypercalls" we are using to build a dom0less domain. > > Option 2 would make safety more challenging but not impossible (we have > a lot of other use cases where we cannot map everything on boot). > > I would vote for option 2 as I think we will not certify gicv2 and it is > not adding an other hyper call. It sounds like option 2 is the preferred way for now. Henry, can you have a look? Cheers,
Hi Julien and Bertrand, > -----Original Message----- > From: Julien Grall <julien@xen.org> > Subject: Re: [PATCH v2] xen/arm: p2m: Populate pages for GICv2 mapping in > arch_domain_create() > > Hi Bertrand, > > On 03/01/2023 12:34, Bertrand Marquis wrote: > > Hi, > > > > Sorry for the delay but I have very limited access to my mails right now. > > > >> On 9 Dec 2022, at 23:11, Stefano Stabellini <sstabellini@kernel.org > >> <mailto:sstabellini@kernel.org>> wrote: > >> > >> On Fri, 9 Dec 2022, Julien Grall wrote: > >>> Hi Henry, > >>> > >>> On 08/12/2022 03:06, Henry Wang wrote: > >>>> I am trying to work on the follow-up improvements about the Arm P2M > >>>> code, > >>>> and while trying to address the comment below, I noticed there was an > >>>> unfinished > >>>> discussion between me and Julien which I would like to continue and > here > >>>> opinions from all of you (if possible). > >>>> > >>>> For the background of the discussion, this was about the failure path of > >>>> arch_domain_create(), where we only call p2m_final_teardown() which > does > >>>> not call relinquish_p2m_mapping()... > >>> So all this mess with the P2M is necessary because we are mapping the > >>> GICv2 > >>> CPU interface in arch_domain_create(). I think we should consider to > >>> defer the > >>> mapping to later. > >>> > >>> Other than it makes the code simpler, it also means we could also the > >>> P2M root > >>> on the same numa node the domain is going to run (those information > >>> are passed > >>> later on). > >>> > >>> There is a couple of options: > >>> 1. Introduce a new hypercall to finish the initialization of a domain > >>> (e.g. > >>> allocating the P2M and map the GICv2 CPU interface). This option was > >>> briefly > >>> discussed with Jan (see [2])./ > >>> 2. Allocate the P2M when populate the P2M pool and defer the GICv2 > CPU > >>> interface mapping until the first access (similar to how with deal > >>> with MMIO > >>> access for ACPI). > >>> > >>> I find the second option is neater but it has the drawback that it > >>> requires on > >>> more trap to the hypervisor and we can't report any mapping failure > >>> (which > >>> should not happen if the P2M was correctly sized). So I am leaning > >>> towards > >>> option 2. > >>> > >>> Any opinions? > >> > >> Option 1 is not great due to the extra hypercall. But I worry that > >> option 2 might make things harder for safety because the > >> mapping/initialization becomes "dynamic". I don't know if this is a > >> valid concern. > >> > >> I would love to hear Bertrand's thoughts about it. Putting him in To: > > > > How option 1 would work for dom0less ? > > Xen would call the function directly. Effectively, this would the same > as all the other "hypercalls" we are using to build a dom0less domain. > > > > > Option 2 would make safety more challenging but not impossible (we have > > a lot of other use cases where we cannot map everything on boot). > > > > I would vote for option 2 as I think we will not certify gicv2 and it is > > not adding an other hyper call. Thanks for the input. > > It sounds like option 2 is the preferred way for now. Henry, can you > have a look? Yes, I would love to prepare some patches to follow the option 2. Kind regards, Henry > > Cheers, > > -- > Julien Grall
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 2c84e6dbbb..831e248ad7 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -740,6 +740,20 @@ int arch_domain_create(struct domain *d, BUG(); } + /* + * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area + * when the domain is created. Considering the worst case for page + * tables and keep a buffer, populate 16 pages to the P2M pages pool here. + * For GICv3, the above-mentioned P2M mapping is not necessary, but since + * the allocated 16 pages here would not be lost, hence populate these + * pages unconditionally. + */ + spin_lock(&d->arch.paging.lock); + rc = p2m_set_allocation(d, 16, NULL); + spin_unlock(&d->arch.paging.lock); + if ( rc != 0 ) + goto fail; + if ( (rc = domain_vgic_register(d, &count)) != 0 ) goto fail; @@ -1064,7 +1078,7 @@ int domain_relinquish_resources(struct domain *d) return ret; PROGRESS(p2m): - ret = p2m_teardown(d); + ret = p2m_teardown(d, true); if ( ret ) return ret; diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h index 42bfd548c4..480d65e95e 100644 --- a/xen/arch/arm/include/asm/p2m.h +++ b/xen/arch/arm/include/asm/p2m.h @@ -194,14 +194,17 @@ int p2m_init(struct domain *d); /* * The P2M resources are freed in two parts: - * - p2m_teardown() will be called when relinquish the resources. It - * will free large resources (e.g. intermediate page-tables) that - * requires preemption. + * - p2m_teardown() will be called preemptively when relinquish the + * resources, in which case it will free large resources (e.g. intermediate + * page-tables) that requires preemption. * - p2m_final_teardown() will be called when domain struct is been * freed. This *cannot* be preempted and therefore one small * resources should be freed here. + * Note that p2m_final_teardown() will also call p2m_teardown(), to properly + * free the P2M when failures happen in the domain creation with P2M pages + * already in use. In this case p2m_teardown() is called non-preemptively. */ -int p2m_teardown(struct domain *d); +int p2m_teardown(struct domain *d, bool allow_preemption); void p2m_final_teardown(struct domain *d); /* diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index f17500ddf3..707bd3e2e3 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -1685,7 +1685,7 @@ static void p2m_free_vmid(struct domain *d) spin_unlock(&vmid_alloc_lock); } -int p2m_teardown(struct domain *d) +int p2m_teardown(struct domain *d, bool allow_preemption) { struct p2m_domain *p2m = p2m_get_hostp2m(d); unsigned long count = 0; @@ -1716,7 +1716,7 @@ int p2m_teardown(struct domain *d) p2m_free_page(p2m->domain, pg); count++; /* Arbitrarily preempt every 512 iterations */ - if ( !(count % 512) && hypercall_preempt_check() ) + if ( allow_preemption && !(count % 512) && hypercall_preempt_check() ) { rc = -ERESTART; break; @@ -1736,6 +1736,17 @@ void p2m_final_teardown(struct domain *d) if ( !p2m->domain ) return; + if ( !page_list_empty(&p2m->pages) ) + p2m_teardown(d, false); + + if ( d->arch.paging.p2m_total_pages != 0 ) + { + spin_lock(&d->arch.paging.lock); + p2m_set_allocation(d, 0, NULL); + spin_unlock(&d->arch.paging.lock); + ASSERT(d->arch.paging.p2m_total_pages == 0); + } + ASSERT(page_list_empty(&p2m->pages)); ASSERT(page_list_empty(&d->arch.paging.p2m_freelist));
Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area when the domain is created. Considering the worst case of page tables which requires 6 P2M pages as the two pages will be consecutive but not necessarily in the same L3 page table and keep a buffer, populate 16 pages as the default value to the P2M pages pool in arch_domain_create() at the domain creation stage to satisfy the GICv2 requirement. For GICv3, the above-mentioned P2M mapping is not necessary, but since the allocated 16 pages here would not be lost, hence populate these pages unconditionally. With the default 16 P2M pages populated, there would be a case that failures would happen in the domain creation with P2M pages already in use. To properly free the P2M for this case, firstly support the optionally preemption of p2m_teardown(), then call p2m_teardown() and p2m_set_allocation(d, 0, NULL) in p2m_final_teardown() if needed. Fixes: cbea5a1149ca ("xen/arm: Allocate and free P2M pages from the P2M pool") Suggested-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Henry Wang <Henry.Wang@arm.com> --- This should also be backported to 4.13, 4.14, 4.15 and 4.16. v2 changes: - Move the p2m_set_allocation(d, 0, NULL); to p2m_final_teardown(). - Support optionally preemption of p2m_teardown(), and make the calling of p2m_teardown() preemptively when relinquish the resources, non-preemptively in p2m_final_teardown(). - Refactor the error handling to make the code use less spin_unlock. - Explain the worst case of page tables and the unconditional population of pages in commit message. - Mention the unconditional population of pages in in-code comment. --- xen/arch/arm/domain.c | 16 +++++++++++++++- xen/arch/arm/include/asm/p2m.h | 11 +++++++---- xen/arch/arm/p2m.c | 15 +++++++++++++-- 3 files changed, 35 insertions(+), 7 deletions(-)