Message ID | 20250304111000.9252-1-alejandro.vallejo@cloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] xen/page_alloc: Simplify domain_adjust_tot_pages | expand |
On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > The logic has too many levels of indirection and it's very hard to > understand it its current form. Split it between the corner case where > the adjustment is bigger than the current claim and the rest to avoid 5 > auxiliary variables. > > Add a functional change to prevent negative adjustments from > re-increasing the claim. This has the nice side effect of avoiding > taking the heap lock here on every free. > > While at it, fix incorrect field name in nearby comment. > > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> I think it would be nice to also ensure that once the domain is finished building (maybe when it's unpaused for the first time?) d->outstanding_pages is set to 0. IMO the claim system was designed to avoid races during domain building, and shouldn't be used once the domain is already running. Thanks, Roger.
On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: > On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > > The logic has too many levels of indirection and it's very hard to > > understand it its current form. Split it between the corner case where > > the adjustment is bigger than the current claim and the rest to avoid 5 > > auxiliary variables. > > > > Add a functional change to prevent negative adjustments from > > re-increasing the claim. This has the nice side effect of avoiding > > taking the heap lock here on every free. > > > > While at it, fix incorrect field name in nearby comment. > > > > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> Thanks. > I think it would be nice to also ensure that once the domain is > finished building (maybe when it's unpaused for the first > time?) d->outstanding_pages is set to 0. IMO the claim system was > designed to avoid races during domain building, and shouldn't be used > once the domain is already running. > > Thanks, Roger. As a matter of implementation that's already the case by toolstack being "nice" and unconditionally clearing claims after populating the physmap. However, I agree the hypervisor should do it on its own. I didn't find a suitable place for it. It may be possible to do it on every unpause with minimal overhead because it doesn't need to take the heap lock unless there are outstanding claims on the domains. Would that sound like an ok idea? I'd rather not add even more state into "struct domain" to count pauses... Cheers, Alejandro
On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: > On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: > > On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > > > The logic has too many levels of indirection and it's very hard to > > > understand it its current form. Split it between the corner case where > > > the adjustment is bigger than the current claim and the rest to avoid 5 > > > auxiliary variables. > > > > > > Add a functional change to prevent negative adjustments from > > > re-increasing the claim. This has the nice side effect of avoiding > > > taking the heap lock here on every free. > > > > > > While at it, fix incorrect field name in nearby comment. > > > > > > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > > > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> > > Thanks. > > > I think it would be nice to also ensure that once the domain is > > finished building (maybe when it's unpaused for the first > > time?) d->outstanding_pages is set to 0. IMO the claim system was > > designed to avoid races during domain building, and shouldn't be used > > once the domain is already running. > > > > Thanks, Roger. > > As a matter of implementation that's already the case by toolstack being "nice" > and unconditionally clearing claims after populating the physmap. I see. Another option would be to refuse the unpause a domain if it still has pending claims. However I don't know how that will work out with all possible toolstacks. > However, I agree the hypervisor should do it on its own. I didn't find a > suitable place for it. You could do it in arch_domain_creation_finished(). > It may be possible to do it on every unpause with > minimal overhead because it doesn't need to take the heap lock unless there are > outstanding claims on the domains. Would that sound like an ok idea? I'd rather > not add even more state into "struct domain" to count pauses... There's another side missing IMO, XENMEM_claim_pages should return an error (and refuse the set any claims) once d->creation_finished == true. Thanks, Roger.
On 11.03.2025 16:42, Roger Pau Monné wrote: > On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: >> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: >>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: >>>> The logic has too many levels of indirection and it's very hard to >>>> understand it its current form. Split it between the corner case where >>>> the adjustment is bigger than the current claim and the rest to avoid 5 >>>> auxiliary variables. >>>> >>>> Add a functional change to prevent negative adjustments from >>>> re-increasing the claim. This has the nice side effect of avoiding >>>> taking the heap lock here on every free. >>>> >>>> While at it, fix incorrect field name in nearby comment. >>>> >>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >>> >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> >> >> Thanks. >> >>> I think it would be nice to also ensure that once the domain is >>> finished building (maybe when it's unpaused for the first >>> time?) d->outstanding_pages is set to 0. IMO the claim system was >>> designed to avoid races during domain building, and shouldn't be used >>> once the domain is already running. >>> >>> Thanks, Roger. >> >> As a matter of implementation that's already the case by toolstack being "nice" >> and unconditionally clearing claims after populating the physmap. > > I see. Another option would be to refuse the unpause a domain if it > still has pending claims. However I don't know how that will work out > with all possible toolstacks. > >> However, I agree the hypervisor should do it on its own. I didn't find a >> suitable place for it. > > You could do it in arch_domain_creation_finished(). Except that better wouldn't be arch-specific. Jan
On Tue, Mar 11, 2025 at 04:45:04PM +0100, Jan Beulich wrote: > On 11.03.2025 16:42, Roger Pau Monné wrote: > > On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: > >> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: > >>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > >>>> The logic has too many levels of indirection and it's very hard to > >>>> understand it its current form. Split it between the corner case where > >>>> the adjustment is bigger than the current claim and the rest to avoid 5 > >>>> auxiliary variables. > >>>> > >>>> Add a functional change to prevent negative adjustments from > >>>> re-increasing the claim. This has the nice side effect of avoiding > >>>> taking the heap lock here on every free. > >>>> > >>>> While at it, fix incorrect field name in nearby comment. > >>>> > >>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > >>> > >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > >> > >> Thanks. > >> > >>> I think it would be nice to also ensure that once the domain is > >>> finished building (maybe when it's unpaused for the first > >>> time?) d->outstanding_pages is set to 0. IMO the claim system was > >>> designed to avoid races during domain building, and shouldn't be used > >>> once the domain is already running. > >>> > >>> Thanks, Roger. > >> > >> As a matter of implementation that's already the case by toolstack being "nice" > >> and unconditionally clearing claims after populating the physmap. > > > > I see. Another option would be to refuse the unpause a domain if it > > still has pending claims. However I don't know how that will work out > > with all possible toolstacks. > > > >> However, I agree the hypervisor should do it on its own. I didn't find a > >> suitable place for it. > > > > You could do it in arch_domain_creation_finished(). > > Except that better wouldn't be arch-specific. Sorry, I should better have said: you could do it at the same place where arch_domain_creation_finished() gets called in domain_unpause_by_systemcontroller(). Thanks, Roger.
On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote: > On 11.03.2025 16:42, Roger Pau Monné wrote: > > On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: > >> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: > >>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > >>>> The logic has too many levels of indirection and it's very hard to > >>>> understand it its current form. Split it between the corner case where > >>>> the adjustment is bigger than the current claim and the rest to avoid 5 > >>>> auxiliary variables. > >>>> > >>>> Add a functional change to prevent negative adjustments from > >>>> re-increasing the claim. This has the nice side effect of avoiding > >>>> taking the heap lock here on every free. > >>>> > >>>> While at it, fix incorrect field name in nearby comment. > >>>> > >>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > >>> > >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > >> > >> Thanks. > >> > >>> I think it would be nice to also ensure that once the domain is > >>> finished building (maybe when it's unpaused for the first > >>> time?) d->outstanding_pages is set to 0. IMO the claim system was > >>> designed to avoid races during domain building, and shouldn't be used > >>> once the domain is already running. > >>> > >>> Thanks, Roger. > >> > >> As a matter of implementation that's already the case by toolstack being "nice" > >> and unconditionally clearing claims after populating the physmap. > > > > I see. Another option would be to refuse the unpause a domain if it > > still has pending claims. However I don't know how that will work out > > with all possible toolstacks. > > > >> However, I agree the hypervisor should do it on its own. I didn't find a > >> suitable place for it. > > > > You could do it in arch_domain_creation_finished(). > > Except that better wouldn't be arch-specific. > > Jan Why would it have to be arch-specific though? As far as the hypervisor is concerned, it doesn't seem to be. Cheers, Alejandro
On 11.03.2025 19:35, Alejandro Vallejo wrote: > On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote: >> On 11.03.2025 16:42, Roger Pau Monné wrote: >>> On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: >>>> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: >>>>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: >>>>>> The logic has too many levels of indirection and it's very hard to >>>>>> understand it its current form. Split it between the corner case where >>>>>> the adjustment is bigger than the current claim and the rest to avoid 5 >>>>>> auxiliary variables. >>>>>> >>>>>> Add a functional change to prevent negative adjustments from >>>>>> re-increasing the claim. This has the nice side effect of avoiding >>>>>> taking the heap lock here on every free. >>>>>> >>>>>> While at it, fix incorrect field name in nearby comment. >>>>>> >>>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >>>>> >>>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> >>>> >>>> Thanks. >>>> >>>>> I think it would be nice to also ensure that once the domain is >>>>> finished building (maybe when it's unpaused for the first >>>>> time?) d->outstanding_pages is set to 0. IMO the claim system was >>>>> designed to avoid races during domain building, and shouldn't be used >>>>> once the domain is already running. >>>>> >>>>> Thanks, Roger. >>>> >>>> As a matter of implementation that's already the case by toolstack being "nice" >>>> and unconditionally clearing claims after populating the physmap. >>> >>> I see. Another option would be to refuse the unpause a domain if it >>> still has pending claims. However I don't know how that will work out >>> with all possible toolstacks. >>> >>>> However, I agree the hypervisor should do it on its own. I didn't find a >>>> suitable place for it. >>> >>> You could do it in arch_domain_creation_finished(). >> >> Except that better wouldn't be arch-specific. > > Why would it have to be arch-specific though? As far as the hypervisor is > concerned, it doesn't seem to be. Together with Roger's earlier clarification on his original remark, I fear I don't understand the question: I asked that it not be arch-specific. And Roger clarified that he also didn't mean it to be. Jan
On Wed Mar 12, 2025 at 8:21 AM GMT, Jan Beulich wrote: > On 11.03.2025 19:35, Alejandro Vallejo wrote: > > On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote: > >> On 11.03.2025 16:42, Roger Pau Monné wrote: > >>> On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote: > >>>> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote: > >>>>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote: > >>>>>> The logic has too many levels of indirection and it's very hard to > >>>>>> understand it its current form. Split it between the corner case where > >>>>>> the adjustment is bigger than the current claim and the rest to avoid 5 > >>>>>> auxiliary variables. > >>>>>> > >>>>>> Add a functional change to prevent negative adjustments from > >>>>>> re-increasing the claim. This has the nice side effect of avoiding > >>>>>> taking the heap lock here on every free. > >>>>>> > >>>>>> While at it, fix incorrect field name in nearby comment. > >>>>>> > >>>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > >>>>> > >>>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > >>>> > >>>> Thanks. > >>>> > >>>>> I think it would be nice to also ensure that once the domain is > >>>>> finished building (maybe when it's unpaused for the first > >>>>> time?) d->outstanding_pages is set to 0. IMO the claim system was > >>>>> designed to avoid races during domain building, and shouldn't be used > >>>>> once the domain is already running. > >>>>> > >>>>> Thanks, Roger. > >>>> > >>>> As a matter of implementation that's already the case by toolstack being "nice" > >>>> and unconditionally clearing claims after populating the physmap. > >>> > >>> I see. Another option would be to refuse the unpause a domain if it > >>> still has pending claims. However I don't know how that will work out > >>> with all possible toolstacks. > >>> > >>>> However, I agree the hypervisor should do it on its own. I didn't find a > >>>> suitable place for it. > >>> > >>> You could do it in arch_domain_creation_finished(). > >> > >> Except that better wouldn't be arch-specific. > > > > Why would it have to be arch-specific though? As far as the hypervisor is > > concerned, it doesn't seem to be. > > Together with Roger's earlier clarification on his original remark, I fear > I don't understand the question: I asked that it not be arch-specific. And > Roger clarified that he also didn't mean it to be. > > Jan Bah, I misread you (and my IMAP server annoyingly decided enough was enough and declared Roger's answer was one too many and deferred sending it until much later). Too much going on, too little attention. Apologies for the noise. I'll send a patch at some point with that adjustment. Cheers, Alejandro
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 1bf070c8c5df..5f9c9305ef37 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -490,31 +490,30 @@ static long outstanding_claims; /* total outstanding claims by all domains */ unsigned long domain_adjust_tot_pages(struct domain *d, long pages) { - long dom_before, dom_after, dom_claimed, sys_before, sys_after; - ASSERT(rspin_is_locked(&d->page_alloc_lock)); d->tot_pages += pages; /* - * can test d->claimed_pages race-free because it can only change + * can test d->outstanding_pages race-free because it can only change * if d->page_alloc_lock and heap_lock are both held, see also * domain_set_outstanding_pages below */ - if ( !d->outstanding_pages ) + if ( !d->outstanding_pages || pages <= 0 ) goto out; spin_lock(&heap_lock); - /* adjust domain outstanding pages; may not go negative */ - dom_before = d->outstanding_pages; - dom_after = dom_before - pages; - BUG_ON(dom_before < 0); - dom_claimed = dom_after < 0 ? 0 : dom_after; - d->outstanding_pages = dom_claimed; - /* flag accounting bug if system outstanding_claims would go negative */ - sys_before = outstanding_claims; - sys_after = sys_before - (dom_before - dom_claimed); - BUG_ON(sys_after < 0); - outstanding_claims = sys_after; + BUG_ON(outstanding_claims < d->outstanding_pages); + if ( d->outstanding_pages < pages ) + { + /* `pages` exceeds the domain's outstanding count. Zero it out. */ + outstanding_claims -= d->outstanding_pages; + d->outstanding_pages = 0; + } + else + { + outstanding_claims -= pages; + d->outstanding_pages -= pages; + } spin_unlock(&heap_lock); out:
The logic has too many levels of indirection and it's very hard to understand it its current form. Split it between the corner case where the adjustment is bigger than the current claim and the rest to avoid 5 auxiliary variables. Add a functional change to prevent negative adjustments from re-increasing the claim. This has the nice side effect of avoiding taking the heap lock here on every free. While at it, fix incorrect field name in nearby comment. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v1->v2: * Functional change by preventing negative adjustments from increasing the claim. Exit early on pages <= 0. * Reflect said functional change in commit message. --- xen/common/page_alloc.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-)