diff mbox series

[v3,2/3] pirq_cleanup_check() leaks

Message ID 5641f8eb-5736-8ccc-122b-b3b47c1bac28@suse.com (mailing list archive)
State New, archived
Headers show
Series new CONFIG_HAS_PIRQ and extra_guest_irqs adjustment | expand

Commit Message

Jan Beulich July 27, 2023, 7:38 a.m. UTC
Its original introduction had two issues: For one the "common" part of
the checks (carried out in the macro) was inverted. And then after
removal from the radix tree the structure wasn't scheduled for freeing.
(All structures still left in the radix tree would be freed upon domain
destruction, though.)

For the freeing to be safe even if it didn't use RCU (i.e. to avoid use-
after-free), re-arrange checks/operations in evtchn_close(), such that
the pointer wouldn't be used anymore after calling pirq_cleanup_check()
(noting that unmap_domain_pirq_emuirq() itself calls the function in the
success case).

Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree")
Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
This is fallout from me looking into whether the function and macro of
the same name could be suitably split, to please Misra rule 5.5. The
idea was apparently that the check done in the macro is the "common"
part, and the actual function would be per-architecture. Pretty clearly
this, if need be, could also be achieved by naming the actual function
e.g. arch_pirq_cleanup_check().

Despite my testing of this (to a certain degree), I'm wary of the
change, since the code has been the way it was for about 12 years. It
feels like I'm overlooking something crucial ...

The wrong check is also what explains why Arm got away without
implementing the function (prior to "restrict concept of pIRQ to x86"):
The compiler simply eliminated the two calls from event_channel.c.
---
v3: New.

Comments

Roger Pau Monné July 1, 2024, 8:55 a.m. UTC | #1
On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
> Its original introduction had two issues: For one the "common" part of
> the checks (carried out in the macro) was inverted.

Is the current logic in evtchn_close() really malfunctioning?

pirq->evtchn = 0;
pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
    unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains

It would seem to me the pirq_cleanup_check() call just after setting
evtchn = 0 was done to account for PV domains, while the second
(hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
do the cleanup for HVM domains.

Maybe there's something that I'm missing, I have to admit the PIRQ
logic is awfully complicated, even more when we mix the HVM PIRQ
stuff.

> And then after
> removal from the radix tree the structure wasn't scheduled for freeing.
> (All structures still left in the radix tree would be freed upon domain
> destruction, though.)

So if my understanding is correct, we didn't have a leak due to the
missing free_pirq_struct() because the inverted check in
pirq_cleanup_check() macro prevented the removal from the radix tree,
and so stale entries would be left there and freed at domain
destruction?

> For the freeing to be safe even if it didn't use RCU (i.e. to avoid use-
> after-free), re-arrange checks/operations in evtchn_close(), such that
> the pointer wouldn't be used anymore after calling pirq_cleanup_check()
> (noting that unmap_domain_pirq_emuirq() itself calls the function in the
> success case).
> 
> Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree")
> Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior")

See the comment above about the call to pirq_cleanup_check() in
unmap_domain_pirq_emuirq(), which might imply 79858fee307c is not
wrong, just confusing.

> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> This is fallout from me looking into whether the function and macro of
> the same name could be suitably split, to please Misra rule 5.5. The
> idea was apparently that the check done in the macro is the "common"
> part, and the actual function would be per-architecture. Pretty clearly
> this, if need be, could also be achieved by naming the actual function
> e.g. arch_pirq_cleanup_check().
> 
> Despite my testing of this (to a certain degree), I'm wary of the
> change, since the code has been the way it was for about 12 years. It
> feels like I'm overlooking something crucial ...
> 
> The wrong check is also what explains why Arm got away without
> implementing the function (prior to "restrict concept of pIRQ to x86"):
> The compiler simply eliminated the two calls from event_channel.c.
> ---
> v3: New.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1349,6 +1349,7 @@ void (pirq_cleanup_check)(struct pirq *p
>  
>      if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
>          BUG();
> +    free_pirq_struct(pirq);
>  }
>  
>  /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
>              if ( !is_hvm_domain(d1) )
>                  pirq_guest_unbind(d1, pirq);
>              pirq->evtchn = 0;
> -            pirq_cleanup_check(pirq, d1);
> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
> +            if ( !is_hvm_domain(d1) ||
> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )

pirq_cleanup_check() already calls pirq_cleanup_check() itself.  Could
you please add a comment to note that unmap_domain_pirq_emuirq()
succeeding implies the call to pirq_cleanup_check() has already been
done?

Otherwise the logic here looks unbalanced by skipping the
pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.

> +                pirq_cleanup_check(pirq, d1);
>          }
>          unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]);
>          break;
> --- a/xen/include/xen/irq.h
> +++ b/xen/include/xen/irq.h
> @@ -158,7 +158,7 @@ extern struct pirq *pirq_get_info(struct
>  void pirq_cleanup_check(struct pirq *, struct domain *);
>  
>  #define pirq_cleanup_check(pirq, d) \
> -    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> +    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)

Not that you need to fix it here, but why not place this check in
pirq_cleanup_check() itself?

Thanks, Roger.
Jan Beulich July 1, 2024, 9:47 a.m. UTC | #2
On 01.07.2024 10:55, Roger Pau Monné wrote:
> On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
>> Its original introduction had two issues: For one the "common" part of
>> the checks (carried out in the macro) was inverted.
> 
> Is the current logic in evtchn_close() really malfunctioning?

First: I'm getting the impression that this entire comment doesn't relate
to the part of the description above, but to the 2nd paragraph further
down. Otherwise I'm afraid I may not properly understand your question,
and hence my response below may not make any sense at all.

> pirq->evtchn = 0;
> pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
> if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
>     unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains
> 
> It would seem to me the pirq_cleanup_check() call just after setting
> evtchn = 0 was done to account for PV domains, while the second
> (hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
> do the cleanup for HVM domains.
> 
> Maybe there's something that I'm missing, I have to admit the PIRQ
> logic is awfully complicated, even more when we mix the HVM PIRQ
> stuff.

If you look at pirq_cleanup_check() you'll notice that it takes care
of one HVM case as well (the not emuirq one, i.e. particularly PVH,
but note also how physdev_hvm_map_pirq() calls map_domain_emuirq_pirq()
only conditionally). Plus the crucial aspect of the 2nd paragraph of
the description is that past calling pirq_cleanup_check() it is not
really valid anymore to (blindly) de-reference the struct pirq pointer
we hold in hands. The is_hvm_domain() qualification wasn't enough,
since - as said - it's only one of the possibilities that would allow
the pirq to remain legal to use past the call, when having taken the
function's

        if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
            return;

path. A 2nd would be taking the

        if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
            return;

path (i.e. a still in use pass-through IRQ), but the 3rd would still
end in the struct pirq being purged even for HVM.

>> And then after
>> removal from the radix tree the structure wasn't scheduled for freeing.
>> (All structures still left in the radix tree would be freed upon domain
>> destruction, though.)
> 
> So if my understanding is correct, we didn't have a leak due to the
> missing free_pirq_struct() because the inverted check in
> pirq_cleanup_check() macro prevented the removal from the radix tree,
> and so stale entries would be left there and freed at domain
> destruction?

That's the understanding I had come to, yes. What I wasn't entirely
sure about (see the 2nd post-commit-message remark) is why the entry
being left in the radix tree never caused any problems. Presumably
that's a result of pirq_get_info() first checking whether an entry is
already there, allocating a new one only for previously empty slots.

>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
>>              if ( !is_hvm_domain(d1) )
>>                  pirq_guest_unbind(d1, pirq);
>>              pirq->evtchn = 0;
>> -            pirq_cleanup_check(pirq, d1);
>> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
>> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
>> +            if ( !is_hvm_domain(d1) ||
>> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
>> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
> 
> pirq_cleanup_check() already calls pirq_cleanup_check() itself.  Could
> you please add a comment to note that unmap_domain_pirq_emuirq()
> succeeding implies the call to pirq_cleanup_check() has already been
> done?
> 
> Otherwise the logic here looks unbalanced by skipping the
> pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.

Sure, added:

                /*
                 * The successful path of unmap_domain_pirq_emuirq() will have
                 * called pirq_cleanup_check() already.
                 */

>> --- a/xen/include/xen/irq.h
>> +++ b/xen/include/xen/irq.h
>> @@ -158,7 +158,7 @@ extern struct pirq *pirq_get_info(struct
>>  void pirq_cleanup_check(struct pirq *, struct domain *);
>>  
>>  #define pirq_cleanup_check(pirq, d) \
>> -    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
>> +    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> 
> Not that you need to fix it here, but why not place this check in
> pirq_cleanup_check() itself?

See the first of the post-commit-message remarks: The goal was to not
require every arch to replicate that check. At the time it wasn't
clear (to me at least) that the entire concept of pIRQ would likely
remain an x86 special thing anyway.

Jan
Roger Pau Monné July 1, 2024, 11:13 a.m. UTC | #3
On Mon, Jul 01, 2024 at 11:47:34AM +0200, Jan Beulich wrote:
> On 01.07.2024 10:55, Roger Pau Monné wrote:
> > On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
> >> Its original introduction had two issues: For one the "common" part of
> >> the checks (carried out in the macro) was inverted.
> > 
> > Is the current logic in evtchn_close() really malfunctioning?
> 
> First: I'm getting the impression that this entire comment doesn't relate
> to the part of the description above, but to the 2nd paragraph further
> down. Otherwise I'm afraid I may not properly understand your question,
> and hence my response below may not make any sense at all.
> 
> > pirq->evtchn = 0;
> > pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
> > if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
> >     unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains
> > 
> > It would seem to me the pirq_cleanup_check() call just after setting
> > evtchn = 0 was done to account for PV domains, while the second
> > (hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
> > do the cleanup for HVM domains.
> > 
> > Maybe there's something that I'm missing, I have to admit the PIRQ
> > logic is awfully complicated, even more when we mix the HVM PIRQ
> > stuff.
> 
> If you look at pirq_cleanup_check() you'll notice that it takes care
> of one HVM case as well (the not emuirq one, i.e. particularly PVH,
> but note also how physdev_hvm_map_pirq() calls map_domain_emuirq_pirq()
> only conditionally). Plus the crucial aspect of the 2nd paragraph of
> the description is that past calling pirq_cleanup_check() it is not
> really valid anymore to (blindly) de-reference the struct pirq pointer
> we hold in hands. The is_hvm_domain() qualification wasn't enough,
> since - as said - it's only one of the possibilities that would allow
> the pirq to remain legal to use past the call, when having taken the
> function's
> 
>         if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
>             return;
> 
> path. A 2nd would be taking the
> 
>         if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
>             return;
> 
> path (i.e. a still in use pass-through IRQ), but the 3rd would still
> end in the struct pirq being purged even for HVM.

Right, I was missing that if pirq is properly freed then further
usages of it after the pirq_cleanup_check() would be use after free.

> >> And then after
> >> removal from the radix tree the structure wasn't scheduled for freeing.
> >> (All structures still left in the radix tree would be freed upon domain
> >> destruction, though.)
> > 
> > So if my understanding is correct, we didn't have a leak due to the
> > missing free_pirq_struct() because the inverted check in
> > pirq_cleanup_check() macro prevented the removal from the radix tree,
> > and so stale entries would be left there and freed at domain
> > destruction?
> 
> That's the understanding I had come to, yes. What I wasn't entirely
> sure about (see the 2nd post-commit-message remark) is why the entry
> being left in the radix tree never caused any problems. Presumably
> that's a result of pirq_get_info() first checking whether an entry is
> already there, allocating a new one only for previously empty slots.

Yes, I came to the same conclusion, that not freeing wasn't an issue
as Xen would re-use the old entry.  Hopefully it's clean enough to not
cause issues when re-using.

> >> --- a/xen/common/event_channel.c
> >> +++ b/xen/common/event_channel.c
> >> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
> >>              if ( !is_hvm_domain(d1) )
> >>                  pirq_guest_unbind(d1, pirq);
> >>              pirq->evtchn = 0;
> >> -            pirq_cleanup_check(pirq, d1);
> >> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
> >> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
> >> +            if ( !is_hvm_domain(d1) ||
> >> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
> >> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
> > 
> > pirq_cleanup_check() already calls pirq_cleanup_check() itself.  Could
> > you please add a comment to note that unmap_domain_pirq_emuirq()
> > succeeding implies the call to pirq_cleanup_check() has already been
> > done?
> > 
> > Otherwise the logic here looks unbalanced by skipping the
> > pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.
> 
> Sure, added:
> 
>                 /*
>                  * The successful path of unmap_domain_pirq_emuirq() will have
>                  * called pirq_cleanup_check() already.
>                  */

With that added:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> >> --- a/xen/include/xen/irq.h
> >> +++ b/xen/include/xen/irq.h
> >> @@ -158,7 +158,7 @@ extern struct pirq *pirq_get_info(struct
> >>  void pirq_cleanup_check(struct pirq *, struct domain *);
> >>  
> >>  #define pirq_cleanup_check(pirq, d) \
> >> -    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> >> +    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> > 
> > Not that you need to fix it here, but why not place this check in
> > pirq_cleanup_check() itself?
> 
> See the first of the post-commit-message remarks: The goal was to not
> require every arch to replicate that check. At the time it wasn't
> clear (to me at least) that the entire concept of pIRQ would likely
> remain an x86 special thing anyway.

Anyway, such change would better be done in a separate commit anyway.

Thanks, Roger.
Jan Beulich July 1, 2024, 1:21 p.m. UTC | #4
On 01.07.2024 13:13, Roger Pau Monné wrote:
> On Mon, Jul 01, 2024 at 11:47:34AM +0200, Jan Beulich wrote:
>> On 01.07.2024 10:55, Roger Pau Monné wrote:
>>> On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
>>>> Its original introduction had two issues: For one the "common" part of
>>>> the checks (carried out in the macro) was inverted.
>>>
>>> Is the current logic in evtchn_close() really malfunctioning?
>>
>> First: I'm getting the impression that this entire comment doesn't relate
>> to the part of the description above, but to the 2nd paragraph further
>> down. Otherwise I'm afraid I may not properly understand your question,
>> and hence my response below may not make any sense at all.
>>
>>> pirq->evtchn = 0;
>>> pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
>>> if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
>>>     unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains
>>>
>>> It would seem to me the pirq_cleanup_check() call just after setting
>>> evtchn = 0 was done to account for PV domains, while the second
>>> (hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
>>> do the cleanup for HVM domains.
>>>
>>> Maybe there's something that I'm missing, I have to admit the PIRQ
>>> logic is awfully complicated, even more when we mix the HVM PIRQ
>>> stuff.
>>
>> If you look at pirq_cleanup_check() you'll notice that it takes care
>> of one HVM case as well (the not emuirq one, i.e. particularly PVH,
>> but note also how physdev_hvm_map_pirq() calls map_domain_emuirq_pirq()
>> only conditionally). Plus the crucial aspect of the 2nd paragraph of
>> the description is that past calling pirq_cleanup_check() it is not
>> really valid anymore to (blindly) de-reference the struct pirq pointer
>> we hold in hands. The is_hvm_domain() qualification wasn't enough,
>> since - as said - it's only one of the possibilities that would allow
>> the pirq to remain legal to use past the call, when having taken the
>> function's
>>
>>         if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
>>             return;
>>
>> path. A 2nd would be taking the
>>
>>         if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
>>             return;
>>
>> path (i.e. a still in use pass-through IRQ), but the 3rd would still
>> end in the struct pirq being purged even for HVM.
> 
> Right, I was missing that if pirq is properly freed then further
> usages of it after the pirq_cleanup_check() would be use after free.
> 
>>>> And then after
>>>> removal from the radix tree the structure wasn't scheduled for freeing.
>>>> (All structures still left in the radix tree would be freed upon domain
>>>> destruction, though.)
>>>
>>> So if my understanding is correct, we didn't have a leak due to the
>>> missing free_pirq_struct() because the inverted check in
>>> pirq_cleanup_check() macro prevented the removal from the radix tree,
>>> and so stale entries would be left there and freed at domain
>>> destruction?
>>
>> That's the understanding I had come to, yes. What I wasn't entirely
>> sure about (see the 2nd post-commit-message remark) is why the entry
>> being left in the radix tree never caused any problems. Presumably
>> that's a result of pirq_get_info() first checking whether an entry is
>> already there, allocating a new one only for previously empty slots.
> 
> Yes, I came to the same conclusion, that not freeing wasn't an issue
> as Xen would re-use the old entry.  Hopefully it's clean enough to not
> cause issues when re-using.
> 
>>>> --- a/xen/common/event_channel.c
>>>> +++ b/xen/common/event_channel.c
>>>> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
>>>>              if ( !is_hvm_domain(d1) )
>>>>                  pirq_guest_unbind(d1, pirq);
>>>>              pirq->evtchn = 0;
>>>> -            pirq_cleanup_check(pirq, d1);
>>>> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
>>>> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
>>>> +            if ( !is_hvm_domain(d1) ||
>>>> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
>>>> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
>>>
>>> pirq_cleanup_check() already calls pirq_cleanup_check() itself.  Could
>>> you please add a comment to note that unmap_domain_pirq_emuirq()
>>> succeeding implies the call to pirq_cleanup_check() has already been
>>> done?
>>>
>>> Otherwise the logic here looks unbalanced by skipping the
>>> pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.
>>
>> Sure, added:
>>
>>                 /*
>>                  * The successful path of unmap_domain_pirq_emuirq() will have
>>                  * called pirq_cleanup_check() already.
>>                  */
> 
> With that added:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks Roger.

Oleksii - would you please consider giving this long-standing bug fix a
release ack?

Thanks, Jan
Oleksii Kurochko July 1, 2024, 3:05 p.m. UTC | #5
On Mon, 2024-07-01 at 15:21 +0200, Jan Beulich wrote:
> On 01.07.2024 13:13, Roger Pau Monné wrote:
> > On Mon, Jul 01, 2024 at 11:47:34AM +0200, Jan Beulich wrote:
> > > On 01.07.2024 10:55, Roger Pau Monné wrote:
> > > > On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
> > > > > Its original introduction had two issues: For one the
> > > > > "common" part of
> > > > > the checks (carried out in the macro) was inverted.
> > > > 
> > > > Is the current logic in evtchn_close() really malfunctioning?
> > > 
> > > First: I'm getting the impression that this entire comment
> > > doesn't relate
> > > to the part of the description above, but to the 2nd paragraph
> > > further
> > > down. Otherwise I'm afraid I may not properly understand your
> > > question,
> > > and hence my response below may not make any sense at all.
> > > 
> > > > pirq->evtchn = 0;
> > > > pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
> > > > if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) >
> > > > 0 )
> > > >     unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for
> > > > HVM domains
> > > > 
> > > > It would seem to me the pirq_cleanup_check() call just after
> > > > setting
> > > > evtchn = 0 was done to account for PV domains, while the second
> > > > (hidden) pirq_cleanup_check() call in
> > > > unmap_domain_pirq_emuirq() would
> > > > do the cleanup for HVM domains.
> > > > 
> > > > Maybe there's something that I'm missing, I have to admit the
> > > > PIRQ
> > > > logic is awfully complicated, even more when we mix the HVM
> > > > PIRQ
> > > > stuff.
> > > 
> > > If you look at pirq_cleanup_check() you'll notice that it takes
> > > care
> > > of one HVM case as well (the not emuirq one, i.e. particularly
> > > PVH,
> > > but note also how physdev_hvm_map_pirq() calls
> > > map_domain_emuirq_pirq()
> > > only conditionally). Plus the crucial aspect of the 2nd paragraph
> > > of
> > > the description is that past calling pirq_cleanup_check() it is
> > > not
> > > really valid anymore to (blindly) de-reference the struct pirq
> > > pointer
> > > we hold in hands. The is_hvm_domain() qualification wasn't
> > > enough,
> > > since - as said - it's only one of the possibilities that would
> > > allow
> > > the pirq to remain legal to use past the call, when having taken
> > > the
> > > function's
> > > 
> > >         if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
> > >             return;
> > > 
> > > path. A 2nd would be taking the
> > > 
> > >         if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
> > >             return;
> > > 
> > > path (i.e. a still in use pass-through IRQ), but the 3rd would
> > > still
> > > end in the struct pirq being purged even for HVM.
> > 
> > Right, I was missing that if pirq is properly freed then further
> > usages of it after the pirq_cleanup_check() would be use after
> > free.
> > 
> > > > > And then after
> > > > > removal from the radix tree the structure wasn't scheduled
> > > > > for freeing.
> > > > > (All structures still left in the radix tree would be freed
> > > > > upon domain
> > > > > destruction, though.)
> > > > 
> > > > So if my understanding is correct, we didn't have a leak due to
> > > > the
> > > > missing free_pirq_struct() because the inverted check in
> > > > pirq_cleanup_check() macro prevented the removal from the radix
> > > > tree,
> > > > and so stale entries would be left there and freed at domain
> > > > destruction?
> > > 
> > > That's the understanding I had come to, yes. What I wasn't
> > > entirely
> > > sure about (see the 2nd post-commit-message remark) is why the
> > > entry
> > > being left in the radix tree never caused any problems.
> > > Presumably
> > > that's a result of pirq_get_info() first checking whether an
> > > entry is
> > > already there, allocating a new one only for previously empty
> > > slots.
> > 
> > Yes, I came to the same conclusion, that not freeing wasn't an
> > issue
> > as Xen would re-use the old entry.  Hopefully it's clean enough to
> > not
> > cause issues when re-using.
> > 
> > > > > --- a/xen/common/event_channel.c
> > > > > +++ b/xen/common/event_channel.c
> > > > > @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
> > > > >              if ( !is_hvm_domain(d1) )
> > > > >                  pirq_guest_unbind(d1, pirq);
> > > > >              pirq->evtchn = 0;
> > > > > -            pirq_cleanup_check(pirq, d1);
> > > > > -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1,
> > > > > pirq->pirq) > 0 )
> > > > > -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
> > > > > +            if ( !is_hvm_domain(d1) ||
> > > > > +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
> > > > > +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) <
> > > > > 0 )
> > > > 
> > > > pirq_cleanup_check() already calls pirq_cleanup_check()
> > > > itself.  Could
> > > > you please add a comment to note that
> > > > unmap_domain_pirq_emuirq()
> > > > succeeding implies the call to pirq_cleanup_check() has already
> > > > been
> > > > done?
> > > > 
> > > > Otherwise the logic here looks unbalanced by skipping the
> > > > pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.
> > > 
> > > Sure, added:
> > > 
> > >                 /*
> > >                  * The successful path of
> > > unmap_domain_pirq_emuirq() will have
> > >                  * called pirq_cleanup_check() already.
> > >                  */
> > 
> > With that added:
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks Roger.
> 
> Oleksii - would you please consider giving this long-standing bug fix
> a
> release ack?
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>

~ Oleksii
diff mbox series

Patch

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1349,6 +1349,7 @@  void (pirq_cleanup_check)(struct pirq *p
 
     if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
         BUG();
+    free_pirq_struct(pirq);
 }
 
 /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -711,9 +711,10 @@  int evtchn_close(struct domain *d1, int
             if ( !is_hvm_domain(d1) )
                 pirq_guest_unbind(d1, pirq);
             pirq->evtchn = 0;
-            pirq_cleanup_check(pirq, d1);
-            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
-                unmap_domain_pirq_emuirq(d1, pirq->pirq);
+            if ( !is_hvm_domain(d1) ||
+                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
+                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
+                pirq_cleanup_check(pirq, d1);
         }
         unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]);
         break;
--- a/xen/include/xen/irq.h
+++ b/xen/include/xen/irq.h
@@ -158,7 +158,7 @@  extern struct pirq *pirq_get_info(struct
 void pirq_cleanup_check(struct pirq *, struct domain *);
 
 #define pirq_cleanup_check(pirq, d) \
-    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
+    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
 
 extern void pirq_guest_eoi(struct pirq *);
 extern void desc_guest_eoi(struct irq_desc *, struct pirq *);