[01/17] replay: Fix migration use of clock for statistics

Message ID 20241220104220.2007786-2-npiggin@gmail.com (mailing list archive)
State New, archived
Series replay: Fixes and avocado test updates

Commit Message

Nicholas Piggin Dec. 20, 2024, 10:42 a.m. UTC
Migration reads CLOCK_HOST when not holding the replay_mutex, which
asserts when recording a trace. These are not guest visible so should
be CLOCK_REALTIME like other statistics in MigrationState, which do
not require the replay_mutex.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 migration/migration.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Peter Xu Dec. 20, 2024, 4:31 p.m. UTC | #1
On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> Migration reads CLOCK_HOST when not holding the replay_mutex, which
> asserts when recording a trace. These are not guest visible so should
> be CLOCK_REALTIME like other statistics in MigrationState, which do
> not require the replay_mutex.

Irrelevant of the change, should we document such lock implications in
timer.h?

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  migration/migration.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 8c5bd0a75c8..2eb9e50a263 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3433,7 +3433,7 @@ static void *migration_thread(void *opaque)
>  {
>      MigrationState *s = opaque;
>      MigrationThread *thread = NULL;
> -    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> +    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      MigThrError thr_error;
>      bool urgent = false;
>      Error *local_err = NULL;
> @@ -3504,7 +3504,7 @@ static void *migration_thread(void *opaque)
>          goto out;
>      }
>  
> -    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
> +    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
>  
>      trace_migration_thread_setup_complete();
>  
> @@ -3584,7 +3584,7 @@ static void *bg_migration_thread(void *opaque)
>  
>      migration_rate_set(RATE_LIMIT_DISABLED);
>  
> -    setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> +    setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      /*
>       * We want to save vmstate for the moment when migration has been
>       * initiated but also we want to save RAM content while VM is running.
> @@ -3629,7 +3629,7 @@ static void *bg_migration_thread(void *opaque)
>          goto fail_setup;
>      }
>  
> -    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
> +    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
>  
>      trace_migration_thread_setup_complete();
>  
> -- 
> 2.45.2
>
Nicholas Piggin Dec. 21, 2024, 3:02 a.m. UTC | #2
On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
> On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> > Migration reads CLOCK_HOST when not holding the replay_mutex, which
> > asserts when recording a trace. These are not guest visible so should
> > be CLOCK_REALTIME like other statistics in MigrationState, which do
> > not require the replay_mutex.
>
> Irrelevant of the change, should we document such lock implications in
> timer.h?

I guess the intention was to try to avoid caller caring too much
about replay internals, so I'm not sure if that will help or
hinder understanding :(

I think the big rule is something like "if it affects guest state,
then you must use HOST or VIRTUAL*, if it does not affect guest state
then you must use REALTIME". record-replay code then takes care of
replay mutex locking.

Does get a little fuzzy around edges in code that is somewhat
aware of record-replay though, like migration/snapshots.

(Pavel please correct me if I've been saying the wrong things)

Thanks,
Nick
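
The rule of thumb quoted above can be written down as a small predicate. This is only a toy sketch of that rule as stated in this thread, not QEMU code: the `Toy*` names and `toy_clock_fits_purpose()` are hypothetical, and the real enum is `QEMUClockType` in timer.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the clock types discussed here (hypothetical names). */
typedef enum {
    TOY_CLOCK_REALTIME,
    TOY_CLOCK_VIRTUAL,
    TOY_CLOCK_HOST,
} ToyClockType;

/*
 * Encode the rule of thumb from the thread:
 *  - a value that affects guest state must come from HOST or VIRTUAL*,
 *    so that record/replay can track it;
 *  - a value that does not affect guest state (e.g. statistics) must
 *    come from REALTIME, which record/replay does not track.
 */
static bool toy_clock_fits_purpose(ToyClockType type, bool affects_guest_state)
{
    bool tracked = (type == TOY_CLOCK_HOST || type == TOY_CLOCK_VIRTUAL);

    return affects_guest_state ? tracked : !tracked;
}
```

Under this reading, the migration setup_time statistic is the `affects_guest_state == false` case, so only the REALTIME clock fits, which is what the patch switches to.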

Peter Xu Dec. 23, 2024, 5:26 p.m. UTC | #3
On Sat, Dec 21, 2024 at 01:02:01PM +1000, Nicholas Piggin wrote:
> On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
> > On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> > > Migration reads CLOCK_HOST when not holding the replay_mutex, which
> > > asserts when recording a trace. These are not guest visible so should
> > > be CLOCK_REALTIME like other statistics in MigrationState, which do
> > > not require the replay_mutex.
> >
> > Irrelevant of the change, should we document such lock implications in
> > timer.h?
> 
> I guess the intention was to try to avoid caller caring too much
> about replay internals, so I'm not sure if that will help or
> hinder understanding :(

CLOCK_HOST should be the wall clock in QEMU, IIUC.  If any QEMU caller
tries to read host wall clock requires some mutex to be held.. then I don't
see how we can avoid mentioning it.  It's indeed weird if we need to take a
feature specific mutex just to read the wallclock.. But maybe I misread the
context somewhere..

> 
> I think the big rule is something like "if it affects guest state,
> then you must use HOST or VIRTUAL*, if it does not affect guest state

HOST clock logically shouldn't be relevant to guest-state?

> then you must use REALTIME". record-replay code then takes care of
> replay mutex locking.
> 
> Does get a little fuzzy around edges in code that is somewhat
> aware of record-replay though, like migration/snapshots.

Said that, I agree with the change itself - any measurement may not want to
involve NTP at all... which HOST / gtod will, but REALTIME won't.  However
this patch doesn't seem to be for that purpose..  So I'd like to double
check.

Thanks,
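
To illustrate the NTP point above outside of QEMU: an interval is best measured against a monotonic clock, because the wall clock can be stepped while the interval is running. The sketch below uses plain POSIX `clock_gettime()`; the helper names `clock_ms()` and `elapsed_ms_monotonic()` are made up for this example. Note the naming trap: as far as I understand, QEMU_CLOCK_REALTIME is backed by the host's monotonic clock where one is available, while QEMU_CLOCK_HOST corresponds to the wall clock (what POSIX calls CLOCK_REALTIME).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the given POSIX clock and return its value in milliseconds. */
static int64_t clock_ms(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);

    assert(rc == 0);
    (void)rc; /* keep NDEBUG builds free of unused-variable warnings */
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Time an optional piece of work against CLOCK_MONOTONIC. The result
 * cannot go negative if the wall clock is stepped backwards meanwhile,
 * which a CLOCK_REALTIME (wall clock) based measurement cannot promise.
 */
static int64_t elapsed_ms_monotonic(void (*work)(void))
{
    int64_t start = clock_ms(CLOCK_MONOTONIC);

    if (work) {
        work();
    }
    return clock_ms(CLOCK_MONOTONIC) - start;
}
```

This mirrors the setup_start / setup_time pattern in the patch, with the monotonic clock playing the role that QEMU_CLOCK_REALTIME plays there.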
Pavel Dovgalyuk Dec. 24, 2024, 7:24 a.m. UTC | #4
On 23.12.2024 20:26, Peter Xu wrote:
> On Sat, Dec 21, 2024 at 01:02:01PM +1000, Nicholas Piggin wrote:
>> On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
>>> On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
>>>> Migration reads CLOCK_HOST when not holding the replay_mutex, which
>>>> asserts when recording a trace. These are not guest visible so should
>>>> be CLOCK_REALTIME like other statistics in MigrationState, which do
>>>> not require the replay_mutex.
>>>
>>> Irrelevant of the change, should we document such lock implications in
>>> timer.h?
>>
>> I guess the intention was to try to avoid caller caring too much
>> about replay internals, so I'm not sure if that will help or
>> hinder understanding :(
> 
> CLOCK_HOST should be the wall clock in QEMU, IIUC.  If any QEMU caller
> tries to read host wall clock requires some mutex to be held.. then I don't
> see how we can avoid mentioning it.  It's indeed weird if we need to take a
> feature specific mutex just to read the wallclock.. But maybe I misread the
> context somewhere..
> 
>>
>> I think the big rule is something like "if it affects guest state,
>> then you must use HOST or VIRTUAL*, if it does not affect guest state
> 
> HOST clock logically shouldn't be relevant to guest-state?

CLOCK_HOST is used for rtc by default. As the rtc affects the guest 
state, therefore CLOCK_HOST affects guest state too.

Migration is not related to guest state change, therefore it should 
either use realtime clock, or set some flag to make host clock reads not 
tracked by record/replay.

> 
>> then you must use REALTIME". record-replay code then takes care of
>> replay mutex locking.
>>
>> Does get a little fuzzy around edges in code that is somewhat
>> aware of record-replay though, like migration/snapshots.
> 
> Said that, I agree with the change itself - any measurement may not want to
> involve NTP at all... which HOST / gtod will, but REALTIME won't.  However
> this patch doesn't seem to be for that purpose..  So I'd like to double
> check.
> 
> Thanks,
>
Peter Xu Dec. 24, 2024, 3:19 p.m. UTC | #5
On Tue, Dec 24, 2024 at 10:24:51AM +0300, Pavel Dovgalyuk wrote:
> On 23.12.2024 20:26, Peter Xu wrote:
> > On Sat, Dec 21, 2024 at 01:02:01PM +1000, Nicholas Piggin wrote:
> > > On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
> > > > On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> > > > > Migration reads CLOCK_HOST when not holding the replay_mutex, which
> > > > > asserts when recording a trace. These are not guest visible so should
> > > > > be CLOCK_REALTIME like other statistics in MigrationState, which do
> > > > > not require the replay_mutex.
> > > > 
> > > > Irrelevant of the change, should we document such lock implications in
> > > > timer.h?
> > > 
> > > I guess the intention was to try to avoid caller caring too much
> > > about replay internals, so I'm not sure if that will help or
> > > hinder understanding :(
> > 
> > CLOCK_HOST should be the wall clock in QEMU, IIUC.  If any QEMU caller
> > tries to read host wall clock requires some mutex to be held.. then I don't
> > see how we can avoid mentioning it.  It's indeed weird if we need to take a
> > feature specific mutex just to read the wallclock.. But maybe I misread the
> > context somewhere..
> > 
> > > 
> > > I think the big rule is something like "if it affects guest state,
> > > then you must use HOST or VIRTUAL*, if it does not affect guest state
> > 
> > HOST clock logically shouldn't be relevant to guest-state?
> 
> CLOCK_HOST is used for rtc by default. As the rtc affects the guest state,
> therefore CLOCK_HOST affects guest state too.

It's not obvious to me that HOST should only be used for rtc, and that it's
part of guest state.  If that's a must, I'd still suggest we add that into
the doc.  But then it means we lose one way to fetch the host wallclock in
the time API; I still see some other users of it, I'm guessing as a way to
fetch the host wall clock.

> 
> Migration is not related to guest state change, therefore it should either
> use realtime clock, or set some flag to make host clock reads not tracked by
> record/replay.

In migration's case, the realtime clock suits better.  But maybe we still
need another clock indeed just to fetch the host wall clock without any lock
implications.  So maybe the better way is making the tracked one
CLOCK_GUEST_RTC, adding rich documentation to avoid abuse, then keeping HOST
with the simple definition.

Thanks,
Patch

diff --git a/migration/migration.c b/migration/migration.c
index 8c5bd0a75c8..2eb9e50a263 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3433,7 +3433,7 @@  static void *migration_thread(void *opaque)
 {
     MigrationState *s = opaque;
     MigrationThread *thread = NULL;
-    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     MigThrError thr_error;
     bool urgent = false;
     Error *local_err = NULL;
@@ -3504,7 +3504,7 @@  static void *migration_thread(void *opaque)
         goto out;
     }
 
-    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
+    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
 
     trace_migration_thread_setup_complete();
 
@@ -3584,7 +3584,7 @@  static void *bg_migration_thread(void *opaque)
 
     migration_rate_set(RATE_LIMIT_DISABLED);
 
-    setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     /*
      * We want to save vmstate for the moment when migration has been
      * initiated but also we want to save RAM content while VM is running.
@@ -3629,7 +3629,7 @@  static void *bg_migration_thread(void *opaque)
         goto fail_setup;
     }
 
-    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
+    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
 
     trace_migration_thread_setup_complete();