diff mbox

[qemu,2/2] kvmclock: reduce kvmclock difference on migration

Message ID 20161114123700.158592605@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Marcelo Tosatti Nov. 14, 2016, 12:36 p.m. UTC
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.

For new machine types, use this value rather than reading 
from guest memory.

This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 hw/i386/kvm/clock.c    |   88 +++++++++++++++++++++++++++++++++++++++++++++++--
 include/hw/i386/pc.h   |    5 ++
 target-i386/kvm.c      |    7 +++
 target-i386/kvm_i386.h |    1 
 4 files changed, 98 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini Nov. 14, 2016, 1:54 p.m. UTC | #1
On 14/11/2016 13:36, Marcelo Tosatti wrote:
> +        /* local (running VM) restore */
> +        if (s->clock_valid) {
> +            /*
> +             * if host does not support reliable KVM_GET_CLOCK,
> +             * read kvmclock value from memory
> +             */
> +            if (!kvm_has_adjust_clock_stable()) {
> +                time_at_migration = kvmclock_current_nsec(s);

Just assign to s->clock here...


> +            }
> +        /* migration/savevm/init restore */
> +        } else {
> +            /*
> +             * use s->clock in case machine uses reliable
> +             * get clock and host where vm was executing
> +             * supported reliable get clock
> +             */
> +            if (!s->mach_use_reliable_get_clock ||
> +                !s->src_use_reliable_get_clock) {
> +                time_at_migration = kvmclock_current_nsec(s);

... and here, so that time_at_migration is not needed anymore.

Also here it's enough to look at s->src_user_reliable_get_clock, because
if s->mach_use_reliable_get_clock is false,
s->src_use_reliable_get_clock will be false as well.

> +            }
> +        }
>  
> -        /* We can't rely on the migrated clock value, just discard it */
> +        /* We can't rely on the saved clock value, just discard it */
>          if (time_at_migration) {
>              s->clock = time_at_migration;

[...]

> 
> +static bool kvmclock_src_use_reliable_get_clock(void *opaque)
> +{
> +    KVMClockState *s = opaque;
> +
> +    /*
> +     * On machine types that support reliable KVM_GET_CLOCK,
> +     * if host kernel does provide reliable KVM_GET_CLOCK,
> +     * set src_use_reliable_get_clock=true so that destination
> +     * avoids reading kvmclock from memory.
> +     */
> +    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
> +        s->src_use_reliable_get_clock = true;
> +    }
> +
> +    return s->src_use_reliable_get_clock;
> +}

Here you can just return s->mach_use_reliable_get_clock.  To set
s->src_use_reliable_get_clock, after issuing KVM_GET_CLOCK you can look
at the KVM_CLOCK_TSC_STABLE bit in the kvm_clock struct's flags.

You don't actually need kvm_has_adjust_clock_stable(), but please place
the KVM_GET_CLOCK code for kvmclock_pre_save and
kvmclock_vm_state_change in a common function.

Also, just another small nit: please make your scripts use the "-p"
option on diff. :)

Thanks,

Paolo
Marcelo Tosatti Nov. 14, 2016, 2 p.m. UTC | #2
On Mon, Nov 14, 2016 at 02:54:38PM +0100, Paolo Bonzini wrote:
> 
> 
> On 14/11/2016 13:36, Marcelo Tosatti wrote:
> > +        /* local (running VM) restore */
> > +        if (s->clock_valid) {
> > +            /*
> > +             * if host does not support reliable KVM_GET_CLOCK,
> > +             * read kvmclock value from memory
> > +             */
> > +            if (!kvm_has_adjust_clock_stable()) {
> > +                time_at_migration = kvmclock_current_nsec(s);
> 
> Just assign to s->clock here...

If kvmclock is not enabled, you want to use s->clock,
rather than 0.

> > +            }
> > +        /* migration/savevm/init restore */
> > +        } else {
> > +            /*
> > +             * use s->clock in case machine uses reliable
> > +             * get clock and host where vm was executing
> > +             * supported reliable get clock
> > +             */
> > +            if (!s->mach_use_reliable_get_clock ||
> > +                !s->src_use_reliable_get_clock) {
> > +                time_at_migration = kvmclock_current_nsec(s);
> 
> ... and here, so that time_at_migration is not needed anymore.

Same as above.

> Also here it's enough to look at s->src_user_reliable_get_clock, because
> if s->mach_use_reliable_get_clock is false,
> s->src_use_reliable_get_clock will be false as well.

Yes, but i like the code annotation.

> > +            }
> > +        }
> >  
> > -        /* We can't rely on the migrated clock value, just discard it */
> > +        /* We can't rely on the saved clock value, just discard it */
> >          if (time_at_migration) {
> >              s->clock = time_at_migration;
> 
> [...]
> 
> > 
> > +static bool kvmclock_src_use_reliable_get_clock(void *opaque)
> > +{
> > +    KVMClockState *s = opaque;
> > +
> > +    /*
> > +     * On machine types that support reliable KVM_GET_CLOCK,
> > +     * if host kernel does provide reliable KVM_GET_CLOCK,
> > +     * set src_use_reliable_get_clock=true so that destination
> > +     * avoids reading kvmclock from memory.
> > +     */
> > +    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
> > +        s->src_use_reliable_get_clock = true;
> > +    }
> > +
> > +    return s->src_use_reliable_get_clock;
> > +}
> 
> Here you can just return s->mach_use_reliable_get_clock. 

mach_use_reliable_get_clock can be true but host might not support it.

>  To set
> s->src_use_reliable_get_clock, after issuing KVM_GET_CLOCK you can look
> at the KVM_CLOCK_TSC_STABLE bit in the kvm_clock struct's flags.

KVM_CLOCK_TSC_STABLE bit in the kvmclock structure != 
KVM_GET_CLOCK returns reliable value, right?


> You don't actually need kvm_has_adjust_clock_stable(), but please place
> the KVM_GET_CLOCK code for kvmclock_pre_save and
> kvmclock_vm_state_change in a common function.

Sure.

> Also, just another small nit: please make your scripts use the "-p"
> option on diff. :)

Sure.

> 
> Thanks,
> 
> Paolo
Juan Quintela Nov. 14, 2016, 2:09 p.m. UTC | #3
Marcelo Tosatti <mtosatti@redhat.com> wrote:
> Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
> indicates that KVM_GET_CLOCK returns a value as seen by the guest at
> that moment.
>
> For new machine types, use this value rather than reading 
> from guest memory.
>
> This reduces kvmclock difference on migration from 5s to 0.1s
> (when max_downtime == 5s).
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Acked-by: Juan Quintela <quintela@redhat.com>

But, if you have to respin it ....


> +    /* whether machine supports reliable KVM_GET_CLOCK */
> +    bool mach_use_reliable_get_clock;
> +
> +    /* whether source host supported reliable KVM_GET_CLOCK */
> +    bool src_use_reliable_get_clock;

This two names are really long, but I don't have better suggesitons :-()
>      if (running) {
>          struct kvm_clock_data data = {};
> -        uint64_t time_at_migration = kvmclock_current_nsec(s);
> +        uint64_t time_at_migration = 0;

This was not "time_at_migration", it was not already before, but just
now it looks really weird. (as it was already faulty, this is why it is
only a suggestion.)

>  
> -        s->clock_valid = false;
> +        /* local (running VM) restore */
> +        if (s->clock_valid) {
> +            /*
> +             * if host does not support reliable KVM_GET_CLOCK,
> +             * read kvmclock value from memory
> +             */
> +            if (!kvm_has_adjust_clock_stable()) {
> +                time_at_migration = kvmclock_current_nsec(s);
> +            }
> +        /* migration/savevm/init restore */
> +        } else {
> +            /*
> +             * use s->clock in case machine uses reliable
> +             * get clock and host where vm was executing
> +             * supported reliable get clock
> +             */

This comment is just weird.  Simplifying
    /* If A and B do C */
    if (!A and || !B) {
       then D();
    }

Doing the opposite comment?

Migration code looks rigth.

Once said that, I continue hating clocks.

Later, Juan.
Juan Quintela Nov. 14, 2016, 2:11 p.m. UTC | #4
Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 14/11/2016 13:36, Marcelo Tosatti wrote:
>> +        /* local (running VM) restore */
>> +        if (s->clock_valid) {
>> +            /*
>> +             * if host does not support reliable KVM_GET_CLOCK,
>> +             * read kvmclock value from memory
>> +             */
>> +            if (!kvm_has_adjust_clock_stable()) {
>> +                time_at_migration = kvmclock_current_nsec(s);
>
> Just assign to s->clock here...

Agreed, I just wanted to make so many changes.


> Also, just another small nit: please make your scripts use the "-p"
> option on diff. :)

YESSSS

I was searching what functions the code belonged for :p

Thanks, Juan.
Paolo Bonzini Nov. 14, 2016, 2:22 p.m. UTC | #5
On 14/11/2016 15:00, Marcelo Tosatti wrote:
> On Mon, Nov 14, 2016 at 02:54:38PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 14/11/2016 13:36, Marcelo Tosatti wrote:
>>> +        /* local (running VM) restore */
>>> +        if (s->clock_valid) {
>>> +            /*
>>> +             * if host does not support reliable KVM_GET_CLOCK,
>>> +             * read kvmclock value from memory
>>> +             */
>>> +            if (!kvm_has_adjust_clock_stable()) {
>>> +                time_at_migration = kvmclock_current_nsec(s);
>>
>> Just assign to s->clock here...
> 
> If kvmclock is not enabled, you want to use s->clock,
> rather than 0.
> 
>>> +            }
>>> +        /* migration/savevm/init restore */
>>> +        } else {
>>> +            /*
>>> +             * use s->clock in case machine uses reliable
>>> +             * get clock and host where vm was executing
>>> +             * supported reliable get clock
>>> +             */
>>> +            if (!s->mach_use_reliable_get_clock ||
>>> +                !s->src_use_reliable_get_clock) {
>>> +                time_at_migration = kvmclock_current_nsec(s);
>>
>> ... and here, so that time_at_migration is not needed anymore.
> 
> Same as above.

You're right.

>> Also here it's enough to look at s->src_user_reliable_get_clock, because
>> if s->mach_use_reliable_get_clock is false,
>> s->src_use_reliable_get_clock will be false as well.
> 
> Yes, but i like the code annotation.

Ah, I think we're looking at it differently.

I'm thinking "mach_use_reliable_get_clock is just for migration,
src_use_reliable_get_clock is the state".  Perhaps you're thinking of
enabling/disabling the whole new code for old machines?  What is the
advantage?

>>> +            }
>>> +        }
>>>  
>>> -        /* We can't rely on the migrated clock value, just discard it */
>>> +        /* We can't rely on the saved clock value, just discard it */
>>>          if (time_at_migration) {
>>>              s->clock = time_at_migration;
>>
>> [...]
>>
>>>
>>> +static bool kvmclock_src_use_reliable_get_clock(void *opaque)
>>> +{
>>> +    KVMClockState *s = opaque;
>>> +
>>> +    /*
>>> +     * On machine types that support reliable KVM_GET_CLOCK,
>>> +     * if host kernel does provide reliable KVM_GET_CLOCK,
>>> +     * set src_use_reliable_get_clock=true so that destination
>>> +     * avoids reading kvmclock from memory.
>>> +     */
>>> +    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
>>> +        s->src_use_reliable_get_clock = true;
>>> +    }
>>> +
>>> +    return s->src_use_reliable_get_clock;
>>> +}
>>
>> Here you can just return s->mach_use_reliable_get_clock. 
> 
> mach_use_reliable_get_clock can be true but host might not support it.

Yes, but the "needed" function is only required to avoid breaking
pc-i440fx-2.7 and earlier.  If you return true here, you can still
migrate a "false" value for src_use_reliable_get_clock.

>>  To set
>> s->src_use_reliable_get_clock, after issuing KVM_GET_CLOCK you can look
>> at the KVM_CLOCK_TSC_STABLE bit in the kvm_clock struct's flags.
> 
> KVM_CLOCK_TSC_STABLE bit in the kvmclock structure != 
> KVM_GET_CLOCK returns reliable value, right?

It is the same as "is using masterclock", which is actually a stricter
condition than the KVM_CHECK_EXTENSION return value.  The right check to
use is whether masterclock is in use, and then the idea is to treat
clock,src_use_reliable_get_clock as one tuple that is updated atomically.

Paolo
Marcelo Tosatti Nov. 14, 2016, 2:50 p.m. UTC | #6
On Mon, Nov 14, 2016 at 03:22:02PM +0100, Paolo Bonzini wrote:
> 
> 
> On 14/11/2016 15:00, Marcelo Tosatti wrote:
> > On Mon, Nov 14, 2016 at 02:54:38PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 14/11/2016 13:36, Marcelo Tosatti wrote:
> >>> +        /* local (running VM) restore */
> >>> +        if (s->clock_valid) {
> >>> +            /*
> >>> +             * if host does not support reliable KVM_GET_CLOCK,
> >>> +             * read kvmclock value from memory
> >>> +             */
> >>> +            if (!kvm_has_adjust_clock_stable()) {
> >>> +                time_at_migration = kvmclock_current_nsec(s);
> >>
> >> Just assign to s->clock here...
> > 
> > If kvmclock is not enabled, you want to use s->clock,
> > rather than 0.
> > 
> >>> +            }
> >>> +        /* migration/savevm/init restore */
> >>> +        } else {
> >>> +            /*
> >>> +             * use s->clock in case machine uses reliable
> >>> +             * get clock and host where vm was executing
> >>> +             * supported reliable get clock
> >>> +             */
> >>> +            if (!s->mach_use_reliable_get_clock ||
> >>> +                !s->src_use_reliable_get_clock) {
> >>> +                time_at_migration = kvmclock_current_nsec(s);
> >>
> >> ... and here, so that time_at_migration is not needed anymore.
> > 
> > Same as above.
> 
> You're right.
> 
> >> Also here it's enough to look at s->src_user_reliable_get_clock, because
> >> if s->mach_use_reliable_get_clock is false,
> >> s->src_use_reliable_get_clock will be false as well.
> > 
> > Yes, but i like the code annotation.
> 
> Ah, I think we're looking at it differently.

Well, i didnt want to mix the meaning of the variables:

+    /* whether machine supports reliable KVM_GET_CLOCK */
+    bool mach_use_reliable_get_clock;
+
+    /* whether source host supported reliable KVM_GET_CLOCK */
+    bool src_use_reliable_get_clock;

See the comments on top (later if you look at the variable, 
then have to think: well it has one name, but its disabled 
by that other path as well, so its more than its 
name,etc...).

> I'm thinking "mach_use_reliable_get_clock is just for migration,

Thats whether the machine supports it. New machines have it enabled,
olders don't.
> src_use_reliable_get_clock is the state". 

Thats whether the migration source supported it.

> Perhaps you're thinking of
> enabling/disabling the whole new code for old machines? 

source          destination         behaviour
supports        supports            on migration use s->clock,
                                    on stop/cont as well

supports        ~supports
                                    on migration use s->clock,
                                    on stop/cont read from guest mem

~support        supports            on migration read from guest,
                                    on stop/cont use
                                    kvm_get_clock/kvm_set_clock

~support        ~support            always read from guest memory


Thats what should happen (and thats what the patch should implement).


"support" means host supports your new KVM_GET_CLOCK/KVM_SET_CLOCK.

>  What is the
> advantage?

Well its necessary to use the correct thing, otherwise
you see a time backwards event.

> 
> >>> +            }
> >>> +        }
> >>>  
> >>> -        /* We can't rely on the migrated clock value, just discard it */
> >>> +        /* We can't rely on the saved clock value, just discard it */
> >>>          if (time_at_migration) {
> >>>              s->clock = time_at_migration;
> >>
> >> [...]
> >>
> >>>
> >>> +static bool kvmclock_src_use_reliable_get_clock(void *opaque)
> >>> +{
> >>> +    KVMClockState *s = opaque;
> >>> +
> >>> +    /*
> >>> +     * On machine types that support reliable KVM_GET_CLOCK,
> >>> +     * if host kernel does provide reliable KVM_GET_CLOCK,
> >>> +     * set src_use_reliable_get_clock=true so that destination
> >>> +     * avoids reading kvmclock from memory.
> >>> +     */
> >>> +    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
> >>> +        s->src_use_reliable_get_clock = true;
> >>> +    }
> >>> +
> >>> +    return s->src_use_reliable_get_clock;
> >>> +}
> >>
> >> Here you can just return s->mach_use_reliable_get_clock. 
> > 
> > mach_use_reliable_get_clock can be true but host might not support it.
> 
> Yes, but the "needed" function is only required to avoid breaking
> pc-i440fx-2.7 and earlier. 

"needed" is required so that the migration between:

SRC             DEST                BEHAVIOUR
~support        supports            on migration read from guest,
                                    on stop/cont use
                                    kvm_get_clock/kvm_set_clock

Destination does not use KVM_GET_CLOCK value (which is
broken and should not be used).

> If you return true here, you can still
> migrate a "false" value for src_use_reliable_get_clock.

But the source only uses a reliable KVM_GET_CLOCK if 
both conditions are true.

And the subsection is only needed if the source
uses a reliable KVM_GET_CLOCK.

> >>  To set
> >> s->src_use_reliable_get_clock, after issuing KVM_GET_CLOCK you can look
> >> at the KVM_CLOCK_TSC_STABLE bit in the kvm_clock struct's flags.
> > 
> > KVM_CLOCK_TSC_STABLE bit in the kvmclock structure != 
> > KVM_GET_CLOCK returns reliable value, right?
> 
> It is the same as "is using masterclock", which is actually a stricter
> condition than the KVM_CHECK_EXTENSION return value.  The right check to
> use is whether masterclock is in use, 

Actually its "has a reliable KVM_GET_CLOCK" (which returns 
get_kernel_clock() + (rdtsc() - tsc_timestamp), 

"broken KVM_GET_CLOCK" =  get_kernel_clock()

> and then the idea is to treat
> clock,src_use_reliable_get_clock as one tuple that is updated atomically.
> 
> Paolo

Hum, not sure i get this...
Paolo Bonzini Nov. 14, 2016, 3 p.m. UTC | #7
On 14/11/2016 15:50, Marcelo Tosatti wrote:
> Well, i didnt want to mix the meaning of the variables:
> 
> +    /* whether machine supports reliable KVM_GET_CLOCK */
> +    bool mach_use_reliable_get_clock;
> +
> +    /* whether source host supported reliable KVM_GET_CLOCK */
> +    bool src_use_reliable_get_clock;
> 
> See the comments on top (later if you look at the variable, 
> then have to think: well it has one name, but its disabled 
> by that other path as well, so its more than its 
> name,etc...).
> 
>> I'm thinking "mach_use_reliable_get_clock is just for migration,
> 
> Thats whether the machine supports it. New machines have it enabled,
> olders don't.

Yes.

>> src_use_reliable_get_clock is the state". 
> 
> Thats whether the migration source supported it.

But it's not used only for migration.  It's used on every vmstate change
(running->stop and stop->running, isn't it?  I think that, apart from
the migration case, it's better to use s->clock if kvmclock is stable,
even on older machine types.

>>>>> +static bool kvmclock_src_use_reliable_get_clock(void *opaque)
>>>>> +{
>>>>> +    KVMClockState *s = opaque;
>>>>> +
>>>>> +    /*
>>>>> +     * On machine types that support reliable KVM_GET_CLOCK,
>>>>> +     * if host kernel does provide reliable KVM_GET_CLOCK,
>>>>> +     * set src_use_reliable_get_clock=true so that destination
>>>>> +     * avoids reading kvmclock from memory.
>>>>> +     */
>>>>> +    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
>>>>> +        s->src_use_reliable_get_clock = true;
>>>>> +    }
>>>>> +
>>>>> +    return s->src_use_reliable_get_clock;
>>>>> +}
>>>>
>>>> Here you can just return s->mach_use_reliable_get_clock. 
>>>
>>> mach_use_reliable_get_clock can be true but host might not support it.
>>
>> Yes, but the "needed" function is only required to avoid breaking
>> pc-i440fx-2.7 and earlier. 
> 
> "needed" is required so that the migration between:
> 
> SRC             DEST                BEHAVIOUR
> ~support        supports            on migration read from guest,
>                                     on stop/cont use
>                                     kvm_get_clock/kvm_set_clock
> 
> Destination does not use KVM_GET_CLOCK value (which is
> broken and should not be used).

If needed returns false, the destination will see
src_use_reliable_get_clock = false anyway.

If needed returns true, the destination can still see
src_use_reliable_get_clock = false if that's the value.

needed	src_use_reliable_get_clock	effect
false	false				use kvmclock_current_nsec
false	true				use kvmclock_current_nsec
true	false				use kvmclock_current_nsec
true	true				use s->clock


So the idea is:

- set s->clock and s->reliable_get_clock on every KVM_GET_CLOCK

- on migration source, use subsections and
s->mach_use_reliable_get_clock to avoid breaking migration on pre-2.8
machine types

- on migration destination, use .pre_load so that s->reliable_get_clock
is initialized to false on older machine types

>> If you return true here, you can still
>> migrate a "false" value for src_use_reliable_get_clock.
> 
> But the source only uses a reliable KVM_GET_CLOCK if 
> both conditions are true.
> 
> And the subsection is only needed if the source
> uses a reliable KVM_GET_CLOCK.

Yes, but the point of the subsection is just to avoid breaking migration
format.

>> It is the same as "is using masterclock", which is actually a stricter
>> condition than the KVM_CHECK_EXTENSION return value.  The right check to
>> use is whether masterclock is in use, 
> 
> Actually its "has a reliable KVM_GET_CLOCK" (which returns 
> get_kernel_clock() + (rdtsc() - tsc_timestamp), 
> 
> "broken KVM_GET_CLOCK" =  get_kernel_clock()

Yes.  And these end up being the same.

Paolo
Marcelo Tosatti Nov. 14, 2016, 3:37 p.m. UTC | #8
On Mon, Nov 14, 2016 at 03:09:46PM +0100, Juan Quintela wrote:
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
> > indicates that KVM_GET_CLOCK returns a value as seen by the guest at
> > that moment.
> >
> > For new machine types, use this value rather than reading 
> > from guest memory.
> >
> > This reduces kvmclock difference on migration from 5s to 0.1s
> > (when max_downtime == 5s).
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> Acked-by: Juan Quintela <quintela@redhat.com>
> 
> But, if you have to respin it ....
> 
> 
> > +    /* whether machine supports reliable KVM_GET_CLOCK */
> > +    bool mach_use_reliable_get_clock;
> > +
> > +    /* whether source host supported reliable KVM_GET_CLOCK */
> > +    bool src_use_reliable_get_clock;
> 
> This two names are really long, but I don't have better suggesitons :-()
> >      if (running) {
> >          struct kvm_clock_data data = {};
> > -        uint64_t time_at_migration = kvmclock_current_nsec(s);
> > +        uint64_t time_at_migration = 0;
> 
> This was not "time_at_migration", it was not already before, but just
> now it looks really weird. (as it was already faulty, this is why it is
> only a suggestion.)
> 
> >  
> > -        s->clock_valid = false;
> > +        /* local (running VM) restore */
> > +        if (s->clock_valid) {
> > +            /*
> > +             * if host does not support reliable KVM_GET_CLOCK,
> > +             * read kvmclock value from memory
> > +             */
> > +            if (!kvm_has_adjust_clock_stable()) {
> > +                time_at_migration = kvmclock_current_nsec(s);
> > +            }
> > +        /* migration/savevm/init restore */
> > +        } else {
> > +            /*
> > +             * use s->clock in case machine uses reliable
> > +             * get clock and host where vm was executing
> > +             * supported reliable get clock
> > +             */
> 
> This comment is just weird.  Simplifying
>     /* If A and B do C */
>     if (!A and || !B) {
>        then D();
>     }
> 
> Doing the opposite comment?
> 
> Migration code looks rigth.

Fixed those.

> Once said that, I continue hating clocks.

Me too, especially the biological one.
diff mbox

Patch

Index: qemu-mig-advance-clock/hw/i386/kvm/clock.c
===================================================================
--- qemu-mig-advance-clock.orig/hw/i386/kvm/clock.c	2016-11-14 09:07:55.519856329 -0200
+++ qemu-mig-advance-clock/hw/i386/kvm/clock.c	2016-11-14 10:19:45.723254737 -0200
@@ -36,6 +36,12 @@ 
 
     uint64_t clock;
     bool clock_valid;
+
+    /* whether machine supports reliable KVM_GET_CLOCK */
+    bool mach_use_reliable_get_clock;
+
+    /* whether source host supported reliable KVM_GET_CLOCK */
+    bool src_use_reliable_get_clock;
 } KVMClockState;
 
 struct pvclock_vcpu_time_info {
@@ -91,15 +97,37 @@ 
 
     if (running) {
         struct kvm_clock_data data = {};
-        uint64_t time_at_migration = kvmclock_current_nsec(s);
+        uint64_t time_at_migration = 0;
 
-        s->clock_valid = false;
+        /* local (running VM) restore */
+        if (s->clock_valid) {
+            /*
+             * if host does not support reliable KVM_GET_CLOCK,
+             * read kvmclock value from memory
+             */
+            if (!kvm_has_adjust_clock_stable()) {
+                time_at_migration = kvmclock_current_nsec(s);
+            }
+        /* migration/savevm/init restore */
+        } else {
+            /*
+             * use s->clock in case machine uses reliable
+             * get clock and host where vm was executing
+             * supported reliable get clock
+             */
+            if (!s->mach_use_reliable_get_clock ||
+                !s->src_use_reliable_get_clock) {
+                time_at_migration = kvmclock_current_nsec(s);
+            }
+        }
 
-        /* We can't rely on the migrated clock value, just discard it */
+        /* We can't rely on the saved clock value, just discard it */
         if (time_at_migration) {
             s->clock = time_at_migration;
         }
 
+        s->clock_valid = false;
+
         data.clock = s->clock;
         ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data);
         if (ret < 0) {
@@ -152,22 +180,76 @@ 
     qemu_add_vm_change_state_handler(kvmclock_vm_state_change, s);
 }
 
+static bool kvmclock_src_use_reliable_get_clock(void *opaque)
+{
+    KVMClockState *s = opaque;
+
+    /*
+     * On machine types that support reliable KVM_GET_CLOCK,
+     * if host kernel does provide reliable KVM_GET_CLOCK,
+     * set src_use_reliable_get_clock=true so that destination
+     * avoids reading kvmclock from memory.
+     */
+    if (s->mach_use_reliable_get_clock && kvm_has_adjust_clock_stable()) {
+        s->src_use_reliable_get_clock = true;
+    }
+
+    return s->src_use_reliable_get_clock;
+}
+
+static const VMStateDescription kvmclock_reliable_get_clock = {
+    .name = "kvmclock/src_use_reliable_get_clock",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = kvmclock_src_use_reliable_get_clock,
+    .fields = (VMStateField[]) {
+        VMSTATE_BOOL(src_use_reliable_get_clock, KVMClockState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void kvmclock_pre_save(void *opaque)
+{
+    KVMClockState *s = opaque;
+    struct kvm_clock_data data;
+    int ret;
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
+                abort();
+    }
+    s->clock = data.clock;
+}
+
 static const VMStateDescription kvmclock_vmsd = {
     .name = "kvmclock",
     .version_id = 1,
     .minimum_version_id = 1,
+    .pre_save = kvmclock_pre_save,
     .fields = (VMStateField[]) {
         VMSTATE_UINT64(clock, KVMClockState),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription * []) {
+        &kvmclock_reliable_get_clock,
+        NULL
     }
 };
 
+static Property kvmclock_properties[] = {
+    DEFINE_PROP_BOOL("mach_use_reliable_get_clock", KVMClockState,
+                      mach_use_reliable_get_clock, true),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
 static void kvmclock_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
 
     dc->realize = kvmclock_realize;
     dc->vmsd = &kvmclock_vmsd;
+    dc->props = kvmclock_properties;
 }
 
 static const TypeInfo kvmclock_info = {
Index: qemu-mig-advance-clock/include/hw/i386/pc.h
===================================================================
--- qemu-mig-advance-clock.orig/include/hw/i386/pc.h	2016-11-14 09:07:55.519856329 -0200
+++ qemu-mig-advance-clock/include/hw/i386/pc.h	2016-11-14 09:11:47.112200123 -0200
@@ -370,6 +370,11 @@ 
 #define PC_COMPAT_2_7 \
     HW_COMPAT_2_7 \
     {\
+        .driver   = "kvmclock",\
+        .property = "mach_use_reliable_get_clock",\
+        .value    = "off",\
+    },\
+    {\
         .driver   = TYPE_X86_CPU,\
         .property = "l3-cache",\
         .value    = "off",\
Index: qemu-mig-advance-clock/target-i386/kvm.c
===================================================================
--- qemu-mig-advance-clock.orig/target-i386/kvm.c	2016-11-14 09:07:55.520856330 -0200
+++ qemu-mig-advance-clock/target-i386/kvm.c	2016-11-14 09:11:47.125200142 -0200
@@ -117,6 +117,13 @@ 
     return kvm_check_extension(kvm_state, KVM_CAP_X86_SMM);
 }
 
+bool kvm_has_adjust_clock_stable(void)
+{
+    int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
+
+    return (ret == KVM_CLOCK_TSC_STABLE);
+}
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
Index: qemu-mig-advance-clock/target-i386/kvm_i386.h
===================================================================
--- qemu-mig-advance-clock.orig/target-i386/kvm_i386.h	2016-11-14 09:07:55.520856330 -0200
+++ qemu-mig-advance-clock/target-i386/kvm_i386.h	2016-11-14 09:11:47.125200142 -0200
@@ -17,6 +17,7 @@ 
 
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
+bool kvm_has_adjust_clock_stable(void);
 void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);