diff mbox series

[v5] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

Message ID 20200310124353.4337-1-paul@xen.org (mailing list archive)
State New, archived
Headers show
Series [v5] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs | expand

Commit Message

Paul Durrant March 10, 2020, 12:43 p.m. UTC
From: Varad Gautam <vrd@amazon.de>

XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple __pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
  domain_kill()
    -> domain_relinquish_resources()
      -> pci_release_devices()
        -> pci_clean_dpci_irq()
          -> pirq_guest_unbind()
            -> __pirq_guest_unbind()

For a shared pirq (nr_guests > 1), the first call would zap the current
domain from the pirq's guests[] list, but the action handler is never freed
as there are other guests using this pirq. As a result, on the second call,
__pirq_guest_unbind searches for the current domain which has been removed
from the guests[] list, and hits a BUG_ON.

Make __pirq_guest_unbind safe to be called multiple times by letting xen
continue if a shared pirq has already been unbound from this guest. The
PIRQ will be cleaned up from the domain's pirq_tree during the destruction
in complete_domain_destroy anyway.

Signed-off-by: Varad Gautam <vrd@amazon.de>
[taking over from Varad at v4]
Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Roger suggested cleaning the entry from the domain pirq_tree so that
we need not make it safe to re-call __pirq_guest_unbind(). This seems like
a reasonable suggestion but the semantics of the code are almost
impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
the name of struct so you generally have little idea what it actally means)
so I prefer to stick with a small fix that I can actually reason about.

v5:
 - BUG_ON(!shareable) rather than ASSERT(shareable)
 - Drop ASSERT on nr_guests

v4:
 - Re-work the guest array search to make it clearer

v3:
  - Style fixups

v2:
 - Split the check on action->nr_guests > 0 and make it an ASSERT
---
 xen/arch/x86/irq.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Comments

Jan Beulich March 10, 2020, 1:56 p.m. UTC | #1
On 10.03.2020 13:43, paul@xen.org wrote:
> v5:
>  - BUG_ON(!shareable) rather than ASSERT(shareable)
>  - Drop ASSERT on nr_guests

Why drop, rather than move ...

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1680,9 +1680,22 @@ static irq_guest_action_t *__pirq_guest_unbind(
>  
>      BUG_ON(!(desc->status & IRQ_GUEST));
>  
> -    for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> -        continue;
> -    BUG_ON(i == action->nr_guests);
> +    for ( i = 0; i < action->nr_guests; i++ )
> +        if ( action->guest[i] == d )
> +            break;
> +
> +    if ( i == action->nr_guests ) /* No matching entry */
> +    {

... back here? (This would be easy enough to take care of while
committing, iff we decided to go with this variant.)

Jan
Paul Durrant March 10, 2020, 2:06 p.m. UTC | #2
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 13:57
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Varad Gautam <vrd@amazon.de>; Julien Grall <julien@xen.org>; Roger
> Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v5] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
> 
> On 10.03.2020 13:43, paul@xen.org wrote:
> > v5:
> >  - BUG_ON(!shareable) rather than ASSERT(shareable)
> >  - Drop ASSERT on nr_guests
> 
> Why drop, rather than move ...
> 
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -1680,9 +1680,22 @@ static irq_guest_action_t *__pirq_guest_unbind(
> >
> >      BUG_ON(!(desc->status & IRQ_GUEST));
> >
> > -    for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> > -        continue;
> > -    BUG_ON(i == action->nr_guests);
> > +    for ( i = 0; i < action->nr_guests; i++ )
> > +        if ( action->guest[i] == d )
> > +            break;
> > +
> > +    if ( i == action->nr_guests ) /* No matching entry */
> > +    {
> 
> ... back here? (This would be easy enough to take care of while
> committing, iff we decided to go with this variant.)

Ok, let's see how your alternative goes.

  Paul

> 
> Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index cc2eb8e925..a3701354e6 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1680,9 +1680,22 @@  static irq_guest_action_t *__pirq_guest_unbind(
 
     BUG_ON(!(desc->status & IRQ_GUEST));
 
-    for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
-        continue;
-    BUG_ON(i == action->nr_guests);
+    for ( i = 0; i < action->nr_guests; i++ )
+        if ( action->guest[i] == d )
+            break;
+
+    if ( i == action->nr_guests ) /* No matching entry */
+    {
+        /*
+         * In case the pirq was shared, unbound for this domain in an earlier
+         * call, but still existed on the domain's pirq_tree, we still reach
+         * here if there are any later unbind calls on the same pirq. Return
+         * if such an unbind happens.
+         */
+        BUG_ON(!action->shareable);
+        return NULL;
+    }
+
     memmove(&action->guest[i], &action->guest[i+1],
             (action->nr_guests-i-1) * sizeof(action->guest[0]));
     action->nr_guests--;