
[1/2] vm_event: sync domctl

Message ID 1450882432-10484-1-git-send-email-tamas@tklengyel.com (mailing list archive)
State New, archived

Commit Message

Tamas K Lengyel Dec. 23, 2015, 2:53 p.m. UTC
Introduce a new vm_event domctl option which allows an event subscriber
to request that all vCPUs not currently pending a vm_event request be paused,
thus allowing the subscriber to sync up on the state of the domain. This
is especially useful when the subscriber wants to disable certain events
from being delivered and wants to ensure no more requests are pending on the
ring before doing so.

Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
---
 tools/libxc/include/xenctrl.h | 11 +++++++++++
 tools/libxc/xc_vm_event.c     | 16 ++++++++++++++++
 xen/common/vm_event.c         | 23 +++++++++++++++++++++++
 xen/include/public/domctl.h   | 14 +++++++++++++-
 4 files changed, 63 insertions(+), 1 deletion(-)

Comments

Razvan Cojocaru Dec. 23, 2015, 3:41 p.m. UTC | #1
On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
> Introduce new vm_event domctl option which allows an event subscriber
> to request all vCPUs not currently pending a vm_event request to be paused,
> thus allowing the subscriber to sync up on the state of the domain. This
> is especially useful when the subscribed wants to disable certain events
> from being delivered and wants to ensure no more requests are pending on the
> ring before doing so.
> 
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>

This certainly looks very interesting. Would xc_domain_pause() not be
enough for your use case then?


Thanks,
Razvan
Andrew Cooper Dec. 23, 2015, 5:17 p.m. UTC | #2
On 23/12/2015 15:41, Razvan Cojocaru wrote:
> On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>> Introduce new vm_event domctl option which allows an event subscriber
>> to request all vCPUs not currently pending a vm_event request to be paused,
>> thus allowing the subscriber to sync up on the state of the domain. This
>> is especially useful when the subscribed wants to disable certain events
>> from being delivered and wants to ensure no more requests are pending on the
>> ring before doing so.
>>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> This certainly looks very interesting. Would xc_domain_pause() not be
> enough for your use case then?

I second this query.  I would have thought xc_domain_pause() does
exactly what you want in this case.

The code provided is racy, as it is liable to alter which pause
references it takes/releases depending on what other pause/unpause
actions are being made.

~Andrew
Tamas K Lengyel Dec. 23, 2015, 6:11 p.m. UTC | #3
On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 23/12/2015 15:41, Razvan Cojocaru wrote:
> > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
> >> Introduce new vm_event domctl option which allows an event subscriber
> >> to request all vCPUs not currently pending a vm_event request to be
> paused,
> >> thus allowing the subscriber to sync up on the state of the domain. This
> >> is especially useful when the subscribed wants to disable certain events
> >> from being delivered and wants to ensure no more requests are pending
> on the
> >> ring before doing so.
> >>
> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >> Cc: Ian Campbell <ian.campbell@citrix.com>
> >> Cc: Wei Liu <wei.liu2@citrix.com>
> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> > This certainly looks very interesting. Would xc_domain_pause() not be
> > enough for your use case then?
>
> I second this query.  I would have thought xc_domain_pause() does
> exactly what you want in this case.
>

The problem is the order in which the responses are processed. I may not be
correct about the logic, but here is what my impression was:
xc_domain_unpause resumes all vCPUs even if there is still a vm_event
response that has not been processed. Now, if the subscriber sets response
flags (altp2m switch, singlestep toggle, etc.), those actions would not be
properly performed on the vCPU before it's resumed. If the subscriber
processes all requests and signals via the event channel that the responses
are on the ring, then calls xc_domain_unpause, we can still have a race
between processing the responses from the ring and unpausing the vCPU.


> The code provided is racy, as it is liable to alter which pause
> references it takes/releases depending on what other pause/unpause
> actions are being made.
>

It's understood that the user would not use xc_domain_pause/unpause while
using vm_event responses with response flags specified. Even then, it was
already racy IMHO if the user called xc_domain_unpause before processing
requests from the vm_event ring that originally paused the vCPU, so this
doesn't change that situation.

Tamas
Razvan Cojocaru Dec. 23, 2015, 7:11 p.m. UTC | #4
On 12/23/2015 08:11 PM, Tamas K Lengyel wrote:
> 
> 
> On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
> 
>     On 23/12/2015 15:41, Razvan Cojocaru wrote:
>     > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>     >> Introduce new vm_event domctl option which allows an event subscriber
>     >> to request all vCPUs not currently pending a vm_event request to be paused,
>     >> thus allowing the subscriber to sync up on the state of the domain. This
>     >> is especially useful when the subscribed wants to disable certain events
>     >> from being delivered and wants to ensure no more requests are pending on the
>     >> ring before doing so.
>     >>
>     >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>     >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>     >> Cc: Ian Campbell <ian.campbell@citrix.com>
>     >> Cc: Wei Liu <wei.liu2@citrix.com>
>     >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>     >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>     > This certainly looks very interesting. Would xc_domain_pause() not be
>     > enough for your use case then?
> 
>     I second this query.  I would have thought xc_domain_pause() does
>     exactly what you want in this case.
> 
> 
> The problem is in what order the responses are processed. I may not be
> correct about the logic but here is what my impression was:
> xc_domain_unpause resumes all vCPUs even if there is still a vm_event
> response that has not been processed. Now, if the subscriber set
> response flags (altp2m switch, singlestep toggle, etc) those actions
> would not be properly performed on the vCPU before it's resumed. If the
> subscriber processes all requests and signals via the event channel that
> the responses are on the ring, then calls xc_domain_unpause, we can
> still have a race between processing the responses from the ring and
> unpausing the vCPU.
>  
> 
>     The code provided is racy, as it is liable to alter which pause
>     references it takes/releases depending on what other pause/unpause
>     actions are being made.
> 
> 
> It's understood that the user would not use xc_domain_pause/unpause
> while using vm_event responses with response flags specified. Even then,
> it was already racy IMHO if the user called xc_domain_unpause before
> processing requests from the vm_event ring that originally paused the
> vCPU, so this doesn't change that situation.

There are a bunch of checks in vcpu_wake() (xen/common/schedule.c) that
I've always assumed guard against the problem you're describing. I may
be wrong (I don't have any experience with the scheduling code), but
even if I am, I still think having xc_domain_pause() /
xc_domain_unpause() behave correctly is better than adding a new libxc
function. Is that an unreasonable goal?


Thanks,
Razvan
Andrew Cooper Dec. 23, 2015, 7:14 p.m. UTC | #5
On 23/12/2015 18:11, Tamas K Lengyel wrote:
>
>
> On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>
>     On 23/12/2015 15:41, Razvan Cojocaru wrote:
>     > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>     >> Introduce new vm_event domctl option which allows an event
>     subscriber
>     >> to request all vCPUs not currently pending a vm_event request
>     to be paused,
>     >> thus allowing the subscriber to sync up on the state of the
>     domain. This
>     >> is especially useful when the subscribed wants to disable
>     certain events
>     >> from being delivered and wants to ensure no more requests are
>     pending on the
>     >> ring before doing so.
>     >>
>     >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>     >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>     >> Cc: Ian Campbell <ian.campbell@citrix.com>
>     >> Cc: Wei Liu <wei.liu2@citrix.com>
>     >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>     >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>     > This certainly looks very interesting. Would xc_domain_pause()
>     not be
>     > enough for your use case then?
>
>     I second this query.  I would have thought xc_domain_pause() does
>     exactly what you want in this case.
>
>
> The problem is in what order the responses are processed. I may not be
> correct about the logic but here is what my impression was:
> xc_domain_unpause resumes all vCPUs even if there is still a vm_event
> response that has not been processed. Now, if the subscriber set
> response flags (altp2m switch, singlestep toggle, etc) those actions
> would not be properly performed on the vCPU before it's resumed. If
> the subscriber processes all requests and signals via the event
> channel that the responses are on the ring, then calls
> xc_domain_unpause, we can still have a race between processing the
> responses from the ring and unpausing the vCPU.
>  
>
>     The code provided is racy, as it is liable to alter which pause
>     references it takes/releases depending on what other pause/unpause
>     actions are being made.
>
>
> It's understood that the user would not use xc_domain_pause/unpause
> while using vm_event responses with response flags specified. Even
> then, it was already racy IMHO if the user called xc_domain_unpause
> before processing requests from the vm_event ring that originally
> paused the vCPU, so this doesn't change that situation.

Pausing is strictly reference counted. (or rather, it is since c/s
3eb1c70 "properly reference count DOMCTL_{,un}pausedomain hypercalls". 
Before then, it definitely was buggy.)

There is the domain pause count, and pause counts per vcpu.  All domain
pause operations take both a domain pause reference, and a vcpu pause
reference on each vcpu.  A vcpu is only eligible to be scheduled if its
pause reference count is zero.  If two independent tasks call
vcpu_pause() on the same vcpu, it will remain paused until both
independent tasks have called vcpu_unpause().

Having said this, I can well believe that there might be issues with the
current uses of pausing.

The vital factor is that the entity which pauses a vcpu is also
responsible for unpausing it, and it must be resistant to accidentally
leaking its reference.

In this case, I believe that what you want to do is:

1) Identify condition requiring a sync
2) xc_domain_pause()
3) Process all of the pending vm_events
4) Synchronise the state
5) xc_domain_unpause()

All vcpus of the domain should stay descheduled between points 2 and 5. 
If this doesn't have the intended effect, then I suspect there is a bug
in the pause reference handling of the vm_event subsystem.

Is this clearer, or have I misunderstood the problem?

~Andrew
Tamas K Lengyel Dec. 23, 2015, 8:55 p.m. UTC | #6
On Wed, Dec 23, 2015 at 8:14 PM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 23/12/2015 18:11, Tamas K Lengyel wrote:
>
>
>
> On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper <andrew.cooper3@citrix.com>
> wrote:
>
>> On 23/12/2015 15:41, Razvan Cojocaru wrote:
>> > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>> >> Introduce new vm_event domctl option which allows an event subscriber
>> >> to request all vCPUs not currently pending a vm_event request to be
>> paused,
>> >> thus allowing the subscriber to sync up on the state of the domain.
>> This
>> >> is especially useful when the subscribed wants to disable certain
>> events
>> >> from being delivered and wants to ensure no more requests are pending
>> on the
>> >> ring before doing so.
>> >>
>> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>> >> Cc: Ian Campbell <ian.campbell@citrix.com>
>> >> Cc: Wei Liu <wei.liu2@citrix.com>
>> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>> > This certainly looks very interesting. Would xc_domain_pause() not be
>> > enough for your use case then?
>>
>> I second this query.  I would have thought xc_domain_pause() does
>> exactly what you want in this case.
>>
>
> The problem is in what order the responses are processed. I may not be
> correct about the logic but here is what my impression was:
> xc_domain_unpause resumes all vCPUs even if there is still a vm_event
> response that has not been processed. Now, if the subscriber set response
> flags (altp2m switch, singlestep toggle, etc) those actions would not be
> properly performed on the vCPU before it's resumed. If the subscriber
> processes all requests and signals via the event channel that the responses
> are on the ring, then calls xc_domain_unpause, we can still have a race
> between processing the responses from the ring and unpausing the vCPU.
>
>
>> The code provided is racy, as it is liable to alter which pause
>> references it takes/releases depending on what other pause/unpause
>> actions are being made.
>>
>
> It's understood that the user would not use xc_domain_pause/unpause while
> using vm_event responses with response flags specified. Even then, it was
> already racy IMHO if the user called xc_domain_unpause before processing
> requests from the vm_event ring that originally paused the vCPU, so this
> doesn't change that situation.
>
>
> Pausing is strictly reference counted. (or rather, it is since c/s 3eb1c70
> "properly reference count DOMCTL_{,un}pausedomain hypercalls".  Before
> then, it definitely was buggy.)
>
> There is the domain pause count, and pause counts per vcpu.  All domain
> pause operations take both a domain pause reference, and a vcpu pause
> reference on each vcpu.  A vcpu is only eligible to be scheduled if its
> pause reference count is zero.  If two independent tasks call vcpu_pause()
> on the same vcpu, it will remain paused until both independent tasks have
> called vcpu_unpause().
>
> Having said this, I can well believe that there might be issues with the
> current uses of pausing.
>
> The vital factor is that the entity which pauses a vcpu is also
> responsible for unpausing it, and it must be resistant to accidentally
> leaking its reference.
>
> In this case, I believe that what you want to do is:
>
> 1) Identify condition requiring a sync
> 2) xc_domain_pause()
> 3) Process all of the pending vm_events
> 4) Synchronise the state
> 5) xc_domain_unpause()
>
> All vcpus of the domain should stay descheduled between points 2 and 5.
> If this doesn't have the intended effect, then I suspect there is a bug in
> the pause reference handing of the vm_event subsystem.
>
> Is this clearer, or have I misunderstood the problem?
>

The problem is with steps 4 & 5 IMHO. The event channel notification AFAIK is
asynchronous in that it just starts the processing of the pending vm_event
responses and returns; it doesn't wait for the responses to all be processed.
Now if we progress to step 5, we might still have some responses on the
ring which have not been processed yet, so there is a race condition.
There is currently no way to get a notification when all responses have
been processed, so the best thing we can do is to make sure we can
pause/unpause vCPUs without pending requests/responses, as those are safe to
resume, while restricting the other vCPUs to be unpaused only through
the pending vm_event response. I hope this makes sense.

That being said, I haven't yet encountered an instance where this race
condition was actually hit, so this is just a precaution.

Tamas
Tamas K Lengyel Dec. 23, 2015, 9:06 p.m. UTC | #7
On Wed, Dec 23, 2015 at 9:55 PM, Tamas K Lengyel <tamas@tklengyel.com>
wrote:

>
>
>
> On Wed, Dec 23, 2015 at 8:14 PM, Andrew Cooper <andrew.cooper3@citrix.com>
> wrote:
>
>> On 23/12/2015 18:11, Tamas K Lengyel wrote:
>>
>>
>>
>> On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>
>>> On 23/12/2015 15:41, Razvan Cojocaru wrote:
>>> > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>>> >> Introduce new vm_event domctl option which allows an event subscriber
>>> >> to request all vCPUs not currently pending a vm_event request to be
>>> paused,
>>> >> thus allowing the subscriber to sync up on the state of the domain.
>>> This
>>> >> is especially useful when the subscribed wants to disable certain
>>> events
>>> >> from being delivered and wants to ensure no more requests are pending
>>> on the
>>> >> ring before doing so.
>>> >>
>>> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>> >> Cc: Ian Campbell <ian.campbell@citrix.com>
>>> >> Cc: Wei Liu <wei.liu2@citrix.com>
>>> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>>> > This certainly looks very interesting. Would xc_domain_pause() not be
>>> > enough for your use case then?
>>>
>>> I second this query.  I would have thought xc_domain_pause() does
>>> exactly what you want in this case.
>>>
>>
>> The problem is in what order the responses are processed. I may not be
>> correct about the logic but here is what my impression was:
>> xc_domain_unpause resumes all vCPUs even if there is still a vm_event
>> response that has not been processed. Now, if the subscriber set response
>> flags (altp2m switch, singlestep toggle, etc) those actions would not be
>> properly performed on the vCPU before it's resumed. If the subscriber
>> processes all requests and signals via the event channel that the responses
>> are on the ring, then calls xc_domain_unpause, we can still have a race
>> between processing the responses from the ring and unpausing the vCPU.
>>
>>
>>> The code provided is racy, as it is liable to alter which pause
>>> references it takes/releases depending on what other pause/unpause
>>> actions are being made.
>>>
>>
>> It's understood that the user would not use xc_domain_pause/unpause while
>> using vm_event responses with response flags specified. Even then, it was
>> already racy IMHO if the user called xc_domain_unpause before processing
>> requests from the vm_event ring that originally paused the vCPU, so this
>> doesn't change that situation.
>>
>>
>> Pausing is strictly reference counted. (or rather, it is since c/s
>> 3eb1c70 "properly reference count DOMCTL_{,un}pausedomain hypercalls".
>> Before then, it definitely was buggy.)
>>
>> There is the domain pause count, and pause counts per vcpu.  All domain
>> pause operations take both a domain pause reference, and a vcpu pause
>> reference on each vcpu.  A vcpu is only eligible to be scheduled if its
>> pause reference count is zero.  If two independent tasks call vcpu_pause()
>> on the same vcpu, it will remain paused until both independent tasks have
>> called vcpu_unpause().
>>
>
Actually, I've double-checked and v->pause_count and
v->vm_event_pause_count are both increased for a vm_event request. So you
are right, the reference counting will make sure that v->pause_count > 0
until we process the vm_event response and call xc_domain_unpause. I was
under the impression that wasn't the case. We can ignore this patch.

Thanks and sorry for the noise ;)
Tamas
Andrew Cooper Dec. 23, 2015, 9:13 p.m. UTC | #8
On 23/12/2015 21:06, Tamas K Lengyel wrote:
>
>
> On Wed, Dec 23, 2015 at 9:55 PM, Tamas K Lengyel <tamas@tklengyel.com> wrote:
>
>
>
>
>     On Wed, Dec 23, 2015 at 8:14 PM, Andrew Cooper
>     <andrew.cooper3@citrix.com> wrote:
>
>         On 23/12/2015 18:11, Tamas K Lengyel wrote:
>>
>>
>>         On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper
>>         <andrew.cooper3@citrix.com> wrote:
>>
>>             On 23/12/2015 15:41, Razvan Cojocaru wrote:
>>             > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>>             >> Introduce new vm_event domctl option which allows an
>>             event subscriber
>>             >> to request all vCPUs not currently pending a vm_event
>>             request to be paused,
>>             >> thus allowing the subscriber to sync up on the state
>>             of the domain. This
>>             >> is especially useful when the subscribed wants to
>>             disable certain events
>>             >> from being delivered and wants to ensure no more
>>             requests are pending on the
>>             >> ring before doing so.
>>             >>
>>             >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>             >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>             >> Cc: Ian Campbell <ian.campbell@citrix.com>
>>             >> Cc: Wei Liu <wei.liu2@citrix.com>
>>             >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>             >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>>             > This certainly looks very interesting. Would
>>             xc_domain_pause() not be
>>             > enough for your use case then?
>>
>>             I second this query.  I would have thought
>>             xc_domain_pause() does
>>             exactly what you want in this case.
>>
>>
>>         The problem is in what order the responses are processed. I
>>         may not be correct about the logic but here is what my
>>         impression was: xc_domain_unpause resumes all vCPUs even if
>>         there is still a vm_event response that has not been
>>         processed. Now, if the subscriber set response flags (altp2m
>>         switch, singlestep toggle, etc) those actions would not be
>>         properly performed on the vCPU before it's resumed. If the
>>         subscriber processes all requests and signals via the event
>>         channel that the responses are on the ring, then calls
>>         xc_domain_unpause, we can still have a race between
>>         processing the responses from the ring and unpausing the vCPU.
>>          
>>
>>             The code provided is racy, as it is liable to alter which
>>             pause
>>             references it takes/releases depending on what other
>>             pause/unpause
>>             actions are being made.
>>
>>
>>         It's understood that the user would not use
>>         xc_domain_pause/unpause while using vm_event responses with
>>         response flags specified. Even then, it was already racy IMHO
>>         if the user called xc_domain_unpause before processing
>>         requests from the vm_event ring that originally paused the
>>         vCPU, so this doesn't change that situation.
>
>         Pausing is strictly reference counted. (or rather, it is since
>         c/s 3eb1c70 "properly reference count DOMCTL_{,un}pausedomain
>         hypercalls".  Before then, it definitely was buggy.)
>
>         There is the domain pause count, and pause counts per vcpu. 
>         All domain pause operations take both a domain pause
>         reference, and a vcpu pause reference on each vcpu.  A vcpu is
>         only eligible to be scheduled if its pause reference count is
>         zero.  If two independent tasks call vcpu_pause() on the same
>         vcpu, it will remain paused until both independent tasks have
>         called vcpu_unpause().
>
>
> Actually, I've double-checked and v->pause_count and
> v->vm_event_pause_count are both increased for a vm_event request. So
> you are right, the reference counting will make sure that
> v->pause_count > 0 until we process the vm_event response and call
> xc_domain_unpause. I was under the impression that wasn't the case. We
> can ignore this patch.
>
> Thanks and sorry for the noise ;)

Not a problem at all.  This is complicated stuff, and IMO it was equally
likely that there was a real bug lurking.

~Andrew
Ian Campbell Jan. 6, 2016, 3:48 p.m. UTC | #9
On Wed, 2015-12-23 at 15:53 +0100, Tamas K Lengyel wrote:
> Introduce new vm_event domctl option which allows an event subscriber
> to request all vCPUs not currently pending a vm_event request to be
> paused,
> thus allowing the subscriber to sync up on the state of the domain. This
> is especially useful when the subscribed wants to disable certain events
> from being delivered and wants to ensure no more requests are pending on
> the
> ring before doing so.
> 
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> ---
>  tools/libxc/include/xenctrl.h | 11 +++++++++++
>  tools/libxc/xc_vm_event.c     | 16 ++++++++++++++++

Tools side is pretty trivial, assuming there is agreement on the underlying
hypercall interface:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> +/***
>   * Memory sharing operations.

Do you also maintain this? If so do you fancy sending a patch to fix:

>   *
>   * Unles otherwise noted, these calls return 0 on succes, -1 and errno on

"Unless" and "success" ?

Ian..
Tamas K Lengyel Jan. 6, 2016, 6:29 p.m. UTC | #10
On Wed, Jan 6, 2016 at 4:48 PM, Ian Campbell <ian.campbell@citrix.com>
wrote:

> On Wed, 2015-12-23 at 15:53 +0100, Tamas K Lengyel wrote:
> > Introduce new vm_event domctl option which allows an event subscriber
> > to request all vCPUs not currently pending a vm_event request to be
> > paused,
> > thus allowing the subscriber to sync up on the state of the domain. This
> > is especially useful when the subscribed wants to disable certain events
> > from being delivered and wants to ensure no more requests are pending on
> > the
> > ring before doing so.
> >
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> > Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> > ---
> >  tools/libxc/include/xenctrl.h | 11 +++++++++++
> >  tools/libxc/xc_vm_event.c     | 16 ++++++++++++++++
>
> Tools side is pretty trivial, assuming there is agreement on the underlying
> hypercall interface:
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>

Thanks, we've decided that this patch is actually not needed as the pause
reference count is already good enough.


>
> > +/***
> >   * Memory sharing operations.
>
> Do you also maintain this? If so do you fancy sending a patch to fix:
>
> >   *
> >   * Unles otherwise noted, these calls return 0 on succes, -1 and errno
> on
>
> "Unless" and "success" ?
>

Sure, that could be done in a separate patch. IMHO the whole sharing
subsystem could use a cleanup series of its own to fix things like this,
style issues and whatnot.

Tamas
Ian Campbell Jan. 7, 2016, 9:58 a.m. UTC | #11
On Wed, 2016-01-06 at 19:29 +0100, Tamas K Lengyel wrote:
> 
> 
> On Wed, Jan 6, 2016 at 4:48 PM, Ian Campbell <ian.campbell@citrix.com>
> wrote:
> > On Wed, 2015-12-23 at 15:53 +0100, Tamas K Lengyel wrote:
> > > Introduce new vm_event domctl option which allows an event subscriber
> > > to request all vCPUs not currently pending a vm_event request to be
> > > paused,
> > > thus allowing the subscriber to sync up on the state of the domain.
> > This
> > > is especially useful when the subscribed wants to disable certain
> > events
> > > from being delivered and wants to ensure no more requests are pending
> > on
> > > the
> > > ring before doing so.
> > >
> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > Cc: Ian Campbell <ian.campbell@citrix.com>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> > > Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> > > ---
> > >  tools/libxc/include/xenctrl.h | 11 +++++++++++
> > >  tools/libxc/xc_vm_event.c     | 16 ++++++++++++++++
> > 
> > Tools side is pretty trivial, assuming there is agreement on the
> > underlying
> > hypercall interface:
> > 
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> Thanks, we've decided that this patch is actually not needed as the pause
> reference count is already good enough.

OK, thanks.

> > > +/***
> > >   * Memory sharing operations.
> > 
> > Do you also maintain this? If so do you fancy sending a patch to fix:
> > 
> > >   *
> > >   * Unles otherwise noted, these calls return 0 on succes, -1 and
> > errno on
> > 
> > "Unless" and "success" ?
> > 
> Sure, that could be done in a separate patch.

Yes, that's what I intended.

>  IMHO the whole sharing subsystem could use a cleanup series of its own
> to fix things like this, style issues and whatnot.

Ian.

Patch

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 01a6dda..27bb907 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2433,6 +2433,17 @@  int xc_monitor_emulate_each_rep(xc_interface *xch, domid_t domain_id,
                                 bool enable);
 
 /***
+ * xc_vm_event_sync_on can be used by a vm_event subscriber to pause all vCPUs
+ * that do not currently have a pending vm_event request. This allows the
+ * subscriber to sync up on the domain's status and process all outstanding
+ * vm_event requests without any new ones being placed on the ring. A caller
+ * of xc_vm_event_sync_on can resume these vCPUs by calling
+ * xc_vm_event_sync_off.
+ */
+int xc_vm_event_sync_on(xc_interface *xch, domid_t domain_id);
+int xc_vm_event_sync_off(xc_interface *xch, domid_t domain_id);
+
+/***
  * Memory sharing operations.
  *
  * Unles otherwise noted, these calls return 0 on succes, -1 and errno on
diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
index 2fef96a..6b39908 100644
--- a/tools/libxc/xc_vm_event.c
+++ b/tools/libxc/xc_vm_event.c
@@ -156,3 +156,19 @@  void *xc_vm_event_enable(xc_interface *xch, domid_t domain_id, int param,
 
     return ring_page;
 }
+
+int xc_vm_event_sync_on(xc_interface *xch, domid_t domain_id)
+{
+    return xc_vm_event_control(xch, domain_id,
+                               XEN_VM_EVENT_ENABLE,
+                               XEN_DOMCTL_VM_EVENT_OP_SYNC,
+                               NULL);
+}
+
+int xc_vm_event_sync_off(xc_interface *xch, domid_t domain_id)
+{
+    return xc_vm_event_control(xch, domain_id,
+                               XEN_VM_EVENT_DISABLE,
+                               XEN_DOMCTL_VM_EVENT_OP_SYNC,
+                               NULL);
+}
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 28a7add..b8298bd 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -726,6 +726,29 @@  int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec,
     break;
 #endif
 
+    case XEN_DOMCTL_VM_EVENT_OP_SYNC:
+    {
+        struct vcpu *v;
+        rc = 0;
+
+        switch( vec->op )
+        {
+        case XEN_VM_EVENT_ENABLE:
+            for_each_vcpu( d, v )
+                if ( !atomic_read(&v->vm_event_pause_count) )
+                    vcpu_pause(v);
+            break;
+
+        default:
+            for_each_vcpu( d, v )
+                if ( !atomic_read(&v->vm_event_pause_count) )
+                    vcpu_unpause(v);
+            break;
+        }
+    }
+    break;
+
+
     default:
         rc = -ENOSYS;
     }
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 7a56b3f..486c667 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -749,7 +749,8 @@  struct xen_domctl_gdbsx_domstatus {
  * sharing, monitor and paging. This hypercall allows one to
  * control these rings (enable/disable), as well as to signal
  * to the hypervisor to pull responses (resume) from the given
- * ring.
+ * ring. Sync will pause/unpause all vCPUs which don't have
+ * a pending vm_event.
  */
 #define XEN_VM_EVENT_ENABLE               0
 #define XEN_VM_EVENT_DISABLE              1
@@ -810,6 +811,17 @@  struct xen_domctl_gdbsx_domstatus {
  */
 #define XEN_DOMCTL_VM_EVENT_OP_SHARING           3
 
+/*
+ * SYNC is a special vm_event operation where all vCPUs get paused
+ * to allow the toolstack to sync up with the state of the domain,
+ * without any new vm_event requests being produced by the domain
+ * on any of the rings.
+ * When issued with ENABLE all the vCPUs get paused that aren't
+ * already paused for a vm_event request. When issued with DISABLE
+ * or RESUME the vCPUs without a pending vm_event request get unpaused.
+ */
+#define XEN_DOMCTL_VM_EVENT_OP_SYNC           4
+
 /* Use for teardown/setup of helper<->hypervisor interface for paging, 
  * access and sharing.*/
 struct xen_domctl_vm_event_op {