diff mbox series

[v2] xen/page_alloc: Simplify domain_adjust_tot_pages

Message ID 20250304111000.9252-1-alejandro.vallejo@cloud.com (mailing list archive)
State New
Headers show
Series [v2] xen/page_alloc: Simplify domain_adjust_tot_pages | expand

Commit Message

Alejandro Vallejo March 4, 2025, 11:10 a.m. UTC
The logic has too many levels of indirection and it's very hard to
understand it its current form. Split it between the corner case where
the adjustment is bigger than the current claim and the rest to avoid 5
auxiliary variables.

Add a functional change to prevent negative adjustments from
re-increasing the claim. This has the nice side effect of avoiding
taking the heap lock here on every free.

While at it, fix incorrect field name in nearby comment.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v1->v2:
  * Functional change by preventing negative adjustments from increasing
    the claim. Exit early on pages <= 0.
  * Reflect said functional change in commit message.
---
 xen/common/page_alloc.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

Comments

Roger Pau Monné March 11, 2025, 12:46 p.m. UTC | #1
On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> The logic has too many levels of indirection and it's very hard to
> understand it its current form. Split it between the corner case where
> the adjustment is bigger than the current claim and the rest to avoid 5
> auxiliary variables.
> 
> Add a functional change to prevent negative adjustments from
> re-increasing the claim. This has the nice side effect of avoiding
> taking the heap lock here on every free.
> 
> While at it, fix incorrect field name in nearby comment.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

I think it would be nice to also ensure that once the domain is
finished building (maybe when it's unpaused for the first
time?) d->outstanding_pages is set to 0.  IMO the claim system was
designed to avoid races during domain building, and shouldn't be used
once the domain is already running.

Thanks, Roger.
Alejandro Vallejo March 11, 2025, 2:53 p.m. UTC | #2
On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> > The logic has too many levels of indirection and it's very hard to
> > understand it its current form. Split it between the corner case where
> > the adjustment is bigger than the current claim and the rest to avoid 5
> > auxiliary variables.
> > 
> > Add a functional change to prevent negative adjustments from
> > re-increasing the claim. This has the nice side effect of avoiding
> > taking the heap lock here on every free.
> > 
> > While at it, fix incorrect field name in nearby comment.
> > 
> > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> I think it would be nice to also ensure that once the domain is
> finished building (maybe when it's unpaused for the first
> time?) d->outstanding_pages is set to 0.  IMO the claim system was
> designed to avoid races during domain building, and shouldn't be used
> once the domain is already running.
>
> Thanks, Roger.

As a matter of implementation that's already the case by toolstack being "nice"
and unconditionally clearing claims after populating the physmap.

However, I agree the hypervisor should do it on its own. I didn't find a
suitable place for it. It may be possible to do it on every unpause with
minimal overhead because it doesn't need to take the heap lock unless there are
outstanding claims on the domains. Would that sound like an ok idea? I'd rather
not add even more state into "struct domain" to count pauses...

Cheers,
Alejandro
Roger Pau Monné March 11, 2025, 3:42 p.m. UTC | #3
On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
> > On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> > > The logic has too many levels of indirection and it's very hard to
> > > understand it its current form. Split it between the corner case where
> > > the adjustment is bigger than the current claim and the rest to avoid 5
> > > auxiliary variables.
> > > 
> > > Add a functional change to prevent negative adjustments from
> > > re-increasing the claim. This has the nice side effect of avoiding
> > > taking the heap lock here on every free.
> > > 
> > > While at it, fix incorrect field name in nearby comment.
> > > 
> > > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> >
> > Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> > I think it would be nice to also ensure that once the domain is
> > finished building (maybe when it's unpaused for the first
> > time?) d->outstanding_pages is set to 0.  IMO the claim system was
> > designed to avoid races during domain building, and shouldn't be used
> > once the domain is already running.
> >
> > Thanks, Roger.
> 
> As a matter of implementation that's already the case by toolstack being "nice"
> and unconditionally clearing claims after populating the physmap.

I see.  Another option would be to refuse the unpause a domain if it
still has pending claims.  However I don't know how that will work out
with all possible toolstacks.

> However, I agree the hypervisor should do it on its own. I didn't find a
> suitable place for it. 

You could do it in arch_domain_creation_finished().

> It may be possible to do it on every unpause with
> minimal overhead because it doesn't need to take the heap lock unless there are
> outstanding claims on the domains. Would that sound like an ok idea? I'd rather
> not add even more state into "struct domain" to count pauses...

There's another side missing IMO, XENMEM_claim_pages should return an
error (and refuse the set any claims) once d->creation_finished ==
true.

Thanks, Roger.
Jan Beulich March 11, 2025, 3:45 p.m. UTC | #4
On 11.03.2025 16:42, Roger Pau Monné wrote:
> On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
>> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
>>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
>>>> The logic has too many levels of indirection and it's very hard to
>>>> understand it its current form. Split it between the corner case where
>>>> the adjustment is bigger than the current claim and the rest to avoid 5
>>>> auxiliary variables.
>>>>
>>>> Add a functional change to prevent negative adjustments from
>>>> re-increasing the claim. This has the nice side effect of avoiding
>>>> taking the heap lock here on every free.
>>>>
>>>> While at it, fix incorrect field name in nearby comment.
>>>>
>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>>
>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Thanks.
>>
>>> I think it would be nice to also ensure that once the domain is
>>> finished building (maybe when it's unpaused for the first
>>> time?) d->outstanding_pages is set to 0.  IMO the claim system was
>>> designed to avoid races during domain building, and shouldn't be used
>>> once the domain is already running.
>>>
>>> Thanks, Roger.
>>
>> As a matter of implementation that's already the case by toolstack being "nice"
>> and unconditionally clearing claims after populating the physmap.
> 
> I see.  Another option would be to refuse the unpause a domain if it
> still has pending claims.  However I don't know how that will work out
> with all possible toolstacks.
> 
>> However, I agree the hypervisor should do it on its own. I didn't find a
>> suitable place for it. 
> 
> You could do it in arch_domain_creation_finished().

Except that better wouldn't be arch-specific.

Jan
Roger Pau Monné March 11, 2025, 4:23 p.m. UTC | #5
On Tue, Mar 11, 2025 at 04:45:04PM +0100, Jan Beulich wrote:
> On 11.03.2025 16:42, Roger Pau Monné wrote:
> > On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
> >> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
> >>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> >>>> The logic has too many levels of indirection and it's very hard to
> >>>> understand it its current form. Split it between the corner case where
> >>>> the adjustment is bigger than the current claim and the rest to avoid 5
> >>>> auxiliary variables.
> >>>>
> >>>> Add a functional change to prevent negative adjustments from
> >>>> re-increasing the claim. This has the nice side effect of avoiding
> >>>> taking the heap lock here on every free.
> >>>>
> >>>> While at it, fix incorrect field name in nearby comment.
> >>>>
> >>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> >>>
> >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> Thanks.
> >>
> >>> I think it would be nice to also ensure that once the domain is
> >>> finished building (maybe when it's unpaused for the first
> >>> time?) d->outstanding_pages is set to 0.  IMO the claim system was
> >>> designed to avoid races during domain building, and shouldn't be used
> >>> once the domain is already running.
> >>>
> >>> Thanks, Roger.
> >>
> >> As a matter of implementation that's already the case by toolstack being "nice"
> >> and unconditionally clearing claims after populating the physmap.
> > 
> > I see.  Another option would be to refuse the unpause a domain if it
> > still has pending claims.  However I don't know how that will work out
> > with all possible toolstacks.
> > 
> >> However, I agree the hypervisor should do it on its own. I didn't find a
> >> suitable place for it. 
> > 
> > You could do it in arch_domain_creation_finished().
> 
> Except that better wouldn't be arch-specific.

Sorry, I should better have said: you could do it at the same place
where arch_domain_creation_finished() gets called in
domain_unpause_by_systemcontroller().

Thanks, Roger.
Alejandro Vallejo March 11, 2025, 6:35 p.m. UTC | #6
On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote:
> On 11.03.2025 16:42, Roger Pau Monné wrote:
> > On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
> >> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
> >>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> >>>> The logic has too many levels of indirection and it's very hard to
> >>>> understand it its current form. Split it between the corner case where
> >>>> the adjustment is bigger than the current claim and the rest to avoid 5
> >>>> auxiliary variables.
> >>>>
> >>>> Add a functional change to prevent negative adjustments from
> >>>> re-increasing the claim. This has the nice side effect of avoiding
> >>>> taking the heap lock here on every free.
> >>>>
> >>>> While at it, fix incorrect field name in nearby comment.
> >>>>
> >>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> >>>
> >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> Thanks.
> >>
> >>> I think it would be nice to also ensure that once the domain is
> >>> finished building (maybe when it's unpaused for the first
> >>> time?) d->outstanding_pages is set to 0.  IMO the claim system was
> >>> designed to avoid races during domain building, and shouldn't be used
> >>> once the domain is already running.
> >>>
> >>> Thanks, Roger.
> >>
> >> As a matter of implementation that's already the case by toolstack being "nice"
> >> and unconditionally clearing claims after populating the physmap.
> > 
> > I see.  Another option would be to refuse the unpause a domain if it
> > still has pending claims.  However I don't know how that will work out
> > with all possible toolstacks.
> > 
> >> However, I agree the hypervisor should do it on its own. I didn't find a
> >> suitable place for it. 
> > 
> > You could do it in arch_domain_creation_finished().
>
> Except that better wouldn't be arch-specific.
>
> Jan

Why would it have to be arch-specific though? As far as the hypervisor is
concerned, it doesn't seem to be.

Cheers,
Alejandro
Jan Beulich March 12, 2025, 8:21 a.m. UTC | #7
On 11.03.2025 19:35, Alejandro Vallejo wrote:
> On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote:
>> On 11.03.2025 16:42, Roger Pau Monné wrote:
>>> On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
>>>> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
>>>>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
>>>>>> The logic has too many levels of indirection and it's very hard to
>>>>>> understand it its current form. Split it between the corner case where
>>>>>> the adjustment is bigger than the current claim and the rest to avoid 5
>>>>>> auxiliary variables.
>>>>>>
>>>>>> Add a functional change to prevent negative adjustments from
>>>>>> re-increasing the claim. This has the nice side effect of avoiding
>>>>>> taking the heap lock here on every free.
>>>>>>
>>>>>> While at it, fix incorrect field name in nearby comment.
>>>>>>
>>>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>>>>
>>>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> Thanks.
>>>>
>>>>> I think it would be nice to also ensure that once the domain is
>>>>> finished building (maybe when it's unpaused for the first
>>>>> time?) d->outstanding_pages is set to 0.  IMO the claim system was
>>>>> designed to avoid races during domain building, and shouldn't be used
>>>>> once the domain is already running.
>>>>>
>>>>> Thanks, Roger.
>>>>
>>>> As a matter of implementation that's already the case by toolstack being "nice"
>>>> and unconditionally clearing claims after populating the physmap.
>>>
>>> I see.  Another option would be to refuse the unpause a domain if it
>>> still has pending claims.  However I don't know how that will work out
>>> with all possible toolstacks.
>>>
>>>> However, I agree the hypervisor should do it on its own. I didn't find a
>>>> suitable place for it. 
>>>
>>> You could do it in arch_domain_creation_finished().
>>
>> Except that better wouldn't be arch-specific.
> 
> Why would it have to be arch-specific though? As far as the hypervisor is
> concerned, it doesn't seem to be.

Together with Roger's earlier clarification on his original remark, I fear
I don't understand the question: I asked that it not be arch-specific. And
Roger clarified that he also didn't mean it to be.

Jan
Alejandro Vallejo March 12, 2025, 11:20 a.m. UTC | #8
On Wed Mar 12, 2025 at 8:21 AM GMT, Jan Beulich wrote:
> On 11.03.2025 19:35, Alejandro Vallejo wrote:
> > On Tue Mar 11, 2025 at 3:45 PM GMT, Jan Beulich wrote:
> >> On 11.03.2025 16:42, Roger Pau Monné wrote:
> >>> On Tue, Mar 11, 2025 at 02:53:04PM +0000, Alejandro Vallejo wrote:
> >>>> On Tue Mar 11, 2025 at 12:46 PM GMT, Roger Pau Monné wrote:
> >>>>> On Tue, Mar 04, 2025 at 11:10:00AM +0000, Alejandro Vallejo wrote:
> >>>>>> The logic has too many levels of indirection and it's very hard to
> >>>>>> understand it its current form. Split it between the corner case where
> >>>>>> the adjustment is bigger than the current claim and the rest to avoid 5
> >>>>>> auxiliary variables.
> >>>>>>
> >>>>>> Add a functional change to prevent negative adjustments from
> >>>>>> re-increasing the claim. This has the nice side effect of avoiding
> >>>>>> taking the heap lock here on every free.
> >>>>>>
> >>>>>> While at it, fix incorrect field name in nearby comment.
> >>>>>>
> >>>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> >>>>>
> >>>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>>
> >>>> Thanks.
> >>>>
> >>>>> I think it would be nice to also ensure that once the domain is
> >>>>> finished building (maybe when it's unpaused for the first
> >>>>> time?) d->outstanding_pages is set to 0.  IMO the claim system was
> >>>>> designed to avoid races during domain building, and shouldn't be used
> >>>>> once the domain is already running.
> >>>>>
> >>>>> Thanks, Roger.
> >>>>
> >>>> As a matter of implementation that's already the case by toolstack being "nice"
> >>>> and unconditionally clearing claims after populating the physmap.
> >>>
> >>> I see.  Another option would be to refuse the unpause a domain if it
> >>> still has pending claims.  However I don't know how that will work out
> >>> with all possible toolstacks.
> >>>
> >>>> However, I agree the hypervisor should do it on its own. I didn't find a
> >>>> suitable place for it. 
> >>>
> >>> You could do it in arch_domain_creation_finished().
> >>
> >> Except that better wouldn't be arch-specific.
> > 
> > Why would it have to be arch-specific though? As far as the hypervisor is
> > concerned, it doesn't seem to be.
>
> Together with Roger's earlier clarification on his original remark, I fear
> I don't understand the question: I asked that it not be arch-specific. And
> Roger clarified that he also didn't mean it to be.
>
> Jan

Bah, I misread you (and my IMAP server annoyingly decided enough was enough and
declared Roger's answer was one too many and deferred sending it until much
later).

Too much going on, too little attention. Apologies for the noise.

I'll send a patch at some point with that adjustment.

Cheers,
Alejandro
diff mbox series

Patch

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 1bf070c8c5df..5f9c9305ef37 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -490,31 +490,30 @@  static long outstanding_claims; /* total outstanding claims by all domains */
 
 unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
 {
-    long dom_before, dom_after, dom_claimed, sys_before, sys_after;
-
     ASSERT(rspin_is_locked(&d->page_alloc_lock));
     d->tot_pages += pages;
 
     /*
-     * can test d->claimed_pages race-free because it can only change
+     * can test d->outstanding_pages race-free because it can only change
      * if d->page_alloc_lock and heap_lock are both held, see also
      * domain_set_outstanding_pages below
      */
-    if ( !d->outstanding_pages )
+    if ( !d->outstanding_pages || pages <= 0 )
         goto out;
 
     spin_lock(&heap_lock);
-    /* adjust domain outstanding pages; may not go negative */
-    dom_before = d->outstanding_pages;
-    dom_after = dom_before - pages;
-    BUG_ON(dom_before < 0);
-    dom_claimed = dom_after < 0 ? 0 : dom_after;
-    d->outstanding_pages = dom_claimed;
-    /* flag accounting bug if system outstanding_claims would go negative */
-    sys_before = outstanding_claims;
-    sys_after = sys_before - (dom_before - dom_claimed);
-    BUG_ON(sys_after < 0);
-    outstanding_claims = sys_after;
+    BUG_ON(outstanding_claims < d->outstanding_pages);
+    if ( d->outstanding_pages < pages )
+    {
+        /* `pages` exceeds the domain's outstanding count. Zero it out. */
+        outstanding_claims -= d->outstanding_pages;
+        d->outstanding_pages = 0;
+    }
+    else
+    {
+        outstanding_claims -= pages;
+        d->outstanding_pages -= pages;
+    }
     spin_unlock(&heap_lock);
 
 out: