[2/6] xen: add bitmap to indicate per-domain state changes

Message ID 20241023131005.32144-3-jgross@suse.com (mailing list archive)
State New
Series remove libxenctrl usage from xenstored

Commit Message

Jürgen Groß Oct. 23, 2024, 1:10 p.m. UTC
Add a bitmap with one bit per possible domid indicating the respective
domain has changed its state (created, deleted, dying, crashed,
shutdown).

Registering the VIRQ_DOM_EXC event will result in setting the bits for
all existing domains and resetting all other bits.

Resetting a bit will be done in a future patch.

This information is needed for Xenstore to keep track of all domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/domain.c        | 21 +++++++++++++++++++++
 xen/common/event_channel.c |  2 ++
 xen/include/xen/sched.h    |  2 ++
 3 files changed, 25 insertions(+)
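
The semantics described in the commit message can be modelled outside the hypervisor. The following is a minimal userspace sketch, not the patch's code: all sketch_* names and the 0x8000 bitmap size are assumptions made for illustration, and the plain read-modify-write used here is not atomic, whereas the patch uses Xen's set_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed for illustration: one bit per possible domid. */
#define SKETCH_NR_DOMIDS 0x8000u
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_LONGS \
    ((SKETCH_NR_DOMIDS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for the patch's dom_state_changed bitmap. */
static unsigned long sketch_state_changed[SKETCH_NR_LONGS];

/* Record that a domain changed state (created, dying, shut down, ...). */
void sketch_mark_changed(unsigned int domid)
{
    assert(domid < SKETCH_NR_DOMIDS);
    sketch_state_changed[domid / SKETCH_BITS_PER_LONG] |=
        1UL << (domid % SKETCH_BITS_PER_LONG);
}

/* Report whether a domain is currently flagged as changed. */
bool sketch_has_changed(unsigned int domid)
{
    assert(domid < SKETCH_NR_DOMIDS);
    return sketch_state_changed[domid / SKETCH_BITS_PER_LONG] &
           (1UL << (domid % SKETCH_BITS_PER_LONG));
}

/*
 * Model of registering for the event: clear every bit, then flag each
 * existing domain so that the consumer looks at all of them once.
 */
void sketch_reset_states(const unsigned int *existing, size_t n)
{
    memset(sketch_state_changed, 0, sizeof(sketch_state_changed));
    for ( size_t i = 0; i < n; i++ )
        sketch_mark_changed(existing[i]);
}
```

In this model a bit set before registration for a domain that no longer exists is dropped by sketch_reset_states(), matching "resetting all other bits".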

Comments

Jan Beulich Oct. 31, 2024, 10:59 a.m. UTC | #1
On 23.10.2024 15:10, Juergen Gross wrote:
> Add a bitmap with one bit per possible domid indicating the respective
> domain has changed its state (created, deleted, dying, crashed,
> shutdown).
> 
> Registering the VIRQ_DOM_EXC event will result in setting the bits for
> all existing domains and resetting all other bits.

That's furthering the "there can be only one consumer" model that also
is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
(nothing keeps a 2nd party with sufficient privilege from invoking
XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
from whoever had first requested it), and hence I dislike this being
extended. Conceivably multiple parties may indeed be interested in
this kind of information. At which point resetting state when the vIRQ
is bound is questionable (or the data would need to become per-domain
rather than global, or even yet more fine-grained, albeit
->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>  
>  bool __read_mostly vpmu_is_available;
>  
> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);

While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
would be more logical to use here and ...

> +void domain_reset_states(void)
> +{
> +    struct domain *d;
> +
> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);

... here.
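
The remark that the array size would not change can be checked with a little arithmetic. This sketch assumes DOMID_FIRST_RESERVED is 0x7FF0 and DOMID_MASK is 0x7FFF, as I believe Xen's public headers define them; please verify against the tree in use. The helper name is hypothetical and only mirrors the usual round-up division used to size a bitmap.

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed from Xen's public headers; verify against your tree. */
#define SKETCH_DOMID_FIRST_RESERVED 0x7FF0u
#define SKETCH_DOMID_MASK           0x7FFFu

/* Number of words, each word_bits wide, needed to hold nbits bits. */
size_t sketch_bits_to_words(size_t nbits, size_t word_bits)
{
    assert(word_bits != 0);
    return (nbits + word_bits - 1) / word_bits;
}
```

With 64-bit words both DOMID_MASK + 1 (32768 bits) and DOMID_FIRST_RESERVED (32752 bits) round up to 512 words, and with 32-bit words both round up to 1024, so under these assumptions only the number of meaningful bits differs, not the storage.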

> +    rcu_read_lock(&domlist_read_lock);
> +
> +    for_each_domain ( d )
> +        set_bit(d->domain_id, dom_state_changed);

d is used only here, so could be pointer-to-const?

> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          rc = evtchn_bind_virq(&bind_virq, 0);
>          if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>              rc = -EFAULT; /* Cleaning up here would be a mess! */
> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
> +            domain_reset_states();

evtchn_bind_virq() isn't static, so callers beyond the present ones could
appear without noticing the need for this special casing. Is there a reason
the check can't move into the function? Doing the check in spite of the
copy-out failing is imo still reasonable behavior.

Jan
Jürgen Groß Nov. 1, 2024, 6:48 a.m. UTC | #2
On 31.10.24 11:59, Jan Beulich wrote:
> On 23.10.2024 15:10, Juergen Gross wrote:
>> Add a bitmap with one bit per possible domid indicating the respective
>> domain has changed its state (created, deleted, dying, crashed,
>> shutdown).
>>
>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>> all existing domains and resetting all other bits.
> 
> That's furthering the "there can be only one consumer" model that also
> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
> (nothing keeps a 2nd party with sufficient privilege from invoking
> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
> from whoever had first requested it), and hence I dislike this being
> extended. Conceivably multiple parties may indeed be interested in
> this kind of information. At which point resetting state when the vIRQ
> is bound is questionable (or the data would need to become per-domain
> rather than global, or even yet more fine-grained, albeit
> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
event which makes the consumer look into the bitmap via the new hypercall.

If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
have one bitmap per consumer of the event. This is not very hard to
modify.

If you'd like that better, I can dynamically allocate the bitmap on
binding VIRQ_DOM_EXC and free it again when unbinding is done.

> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>>   
>>   bool __read_mostly vpmu_is_available;
>>   
>> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
> 
> While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
> would be more logical to use here and ...
> 
>> +void domain_reset_states(void)
>> +{
>> +    struct domain *d;
>> +
>> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
> 
> ... here.

Fine with me.

> 
>> +    rcu_read_lock(&domlist_read_lock);
>> +
>> +    for_each_domain ( d )
>> +        set_bit(d->domain_id, dom_state_changed);
> 
> d is used only here, so could be pointer-to-const?

Agreed.

> 
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>           rc = evtchn_bind_virq(&bind_virq, 0);
>>           if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>>               rc = -EFAULT; /* Cleaning up here would be a mess! */
>> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
>> +            domain_reset_states();
> 
> evtchn_bind_virq() isn't static, so callers beyond the present ones could
> appear without noticing the need for this special casing. Is there a reason
> the check can't move into the function? Doing the check in spite of the
> copy-out failing is imo still reasonable behavior.

Moving the test into evtchn_bind_virq() should work. I'll change that.


Juergen
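
The agreed change, moving the VIRQ_DOM_EXC special case into the bind function itself, could look roughly like the following userspace model. It is a sketch under stated assumptions: the sketch_* names, the enum values and the return codes are hypothetical and do not reflect the real evtchn_bind_virq() signature.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not Xen's real vIRQ numbers. */
enum { SKETCH_VIRQ_TIMER, SKETCH_VIRQ_DOM_EXC, SKETCH_NR_VIRQS };

static bool sketch_bound[SKETCH_NR_VIRQS];
static unsigned int sketch_reset_count;

/* Stand-in for domain_reset_states(); it only counts invocations. */
static void sketch_reset_states(void)
{
    sketch_reset_count++;
}

/* Number of times the state reset has run so far. */
unsigned int sketch_resets(void)
{
    return sketch_reset_count;
}

/* Returns 0 on success, -1 if the vIRQ is invalid or already bound. */
int sketch_bind_virq(unsigned int virq)
{
    if ( virq >= SKETCH_NR_VIRQS || sketch_bound[virq] )
        return -1;

    sketch_bound[virq] = true;

    /* The special case lives here, so no caller can forget it. */
    if ( virq == SKETCH_VIRQ_DOM_EXC )
        sketch_reset_states();

    return 0;
}
```

Because the reset happens inside the bind function, it no longer depends on whether a caller's subsequent copy-out succeeds, which is the behavior the review describes as still reasonable.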
Jan Beulich Nov. 4, 2024, 9:35 a.m. UTC | #3
On 01.11.2024 07:48, Jürgen Groß wrote:
> On 31.10.24 11:59, Jan Beulich wrote:
>> On 23.10.2024 15:10, Juergen Gross wrote:
>>> Add a bitmap with one bit per possible domid indicating the respective
>>> domain has changed its state (created, deleted, dying, crashed,
>>> shutdown).
>>>
>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>> all existing domains and resetting all other bits.
>>
>> That's furthering the "there can be only one consumer" model that also
>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>> (nothing keeps a 2nd party with sufficient privilege from invoking
>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>> from whoever had first requested it), and hence I dislike this being
>> extended. Conceivably multiple parties may indeed be interested in
>> this kind of information. At which point resetting state when the vIRQ
>> is bound is questionable (or the data would need to become per-domain
>> rather than global, or even yet more fine-grained, albeit
>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
> 
> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
> event which makes the consumer look into the bitmap via the new hypercall.
> 
> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
> have one bitmap per consumer of the event. This is not very hard to
> modify.
> 
> If you'd like that better, I can dynamically allocate the bitmap on
> binding VIRQ_DOM_EXC and free it again when unbinding is done.

I'd prefer that indeed, yet I'm also curious what other maintainers think.

Jan
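
The alternative discussed above, a bitmap allocated when the event is bound and freed on unbind, might be modelled as below. This is only a userspace sketch under assumptions: struct sketch_consumer and the sketch_* functions are hypothetical, calloc()/free() stand in for the hypervisor's allocator, and no locking or atomicity is shown.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_NR_DOMIDS 0x8000u  /* assumed: one bit per possible domid */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_LONGS \
    ((SKETCH_NR_DOMIDS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical per-consumer state; a NULL bitmap means "not bound". */
struct sketch_consumer {
    unsigned long *changed;
};

/* Allocate a zeroed bitmap when the consumer binds the event. */
int sketch_consumer_bind(struct sketch_consumer *c)
{
    if ( c->changed )
        return -1; /* already bound */
    c->changed = calloc(SKETCH_NR_LONGS, sizeof(*c->changed));
    return c->changed ? 0 : -1;
}

/* Free the bitmap again when the consumer unbinds. */
void sketch_consumer_unbind(struct sketch_consumer *c)
{
    free(c->changed);
    c->changed = NULL;
}

/* Flag a domain for one consumer; a no-op while it is unbound. */
void sketch_consumer_mark(struct sketch_consumer *c, unsigned int domid)
{
    assert(domid < SKETCH_NR_DOMIDS);
    if ( c->changed )
        c->changed[domid / SKETCH_BITS_PER_LONG] |=
            1UL << (domid % SKETCH_BITS_PER_LONG);
}

/* Report whether a domain is flagged for this consumer. */
bool sketch_consumer_test(const struct sketch_consumer *c, unsigned int domid)
{
    assert(domid < SKETCH_NR_DOMIDS);
    return c->changed &&
           (c->changed[domid / SKETCH_BITS_PER_LONG] &
            (1UL << (domid % SKETCH_BITS_PER_LONG)));
}
```

Keeping the bitmap inside a per-consumer structure is what would allow more than one consumer later: each would own its bitmap, and binding one would not reset the state seen by another.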
Patch

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3948640fb0..61b7899cb8 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
 
 bool __read_mostly vpmu_is_available;
 
+static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
+
+void domain_reset_states(void)
+{
+    struct domain *d;
+
+    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+        set_bit(d->domain_id, dom_state_changed);
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
@@ -152,6 +168,7 @@ static void __domain_finalise_shutdown(struct domain *d)
             return;
 
     d->is_shut_down = 1;
+    set_bit(d->domain_id, dom_state_changed);
     if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
         evtchn_send(d, d->suspend_evtchn);
     else
@@ -832,6 +849,7 @@ struct domain *domain_create(domid_t domid,
      */
     domlist_insert(d);
 
+    set_bit(d->domain_id, dom_state_changed);
     memcpy(d->handle, config->handle, sizeof(d->handle));
 
     return d;
@@ -1097,6 +1115,7 @@ int domain_kill(struct domain *d)
         /* Mem event cleanup has to go here because the rings 
          * have to be put before we call put_domain. */
         vm_event_cleanup(d);
+        set_bit(d->domain_id, dom_state_changed);
         put_domain(d);
         send_global_virq(VIRQ_DOM_EXC);
         /* fallthrough */
@@ -1286,6 +1305,8 @@ static void cf_check complete_domain_destroy(struct rcu_head *head)
 
     xfree(d->vcpu);
 
+    set_bit(d->domain_id, dom_state_changed);
+
     _domain_destroy(d);
 
     send_global_virq(VIRQ_DOM_EXC);
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 8db2ca4ba2..9b87d29968 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         rc = evtchn_bind_virq(&bind_virq, 0);
         if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
+        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
+            domain_reset_states();
         break;
     }
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1dd8a425f9..667863263d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -800,6 +800,8 @@ void domain_resume(struct domain *d);
 
 int domain_soft_reset(struct domain *d, bool resuming);
 
+void domain_reset_states(void);
+
 int vcpu_start_shutdown_deferral(struct vcpu *v);
 void vcpu_end_shutdown_deferral(struct vcpu *v);