mbox series

[0/7] migration: query-migrationthreads enhancements and cleanups

Message ID 20240930195837.825728-1-peterx@redhat.com (mailing list archive)
Headers show
Series migration: query-migrationthreads enhancements and cleanups | expand

Message

Peter Xu Sept. 30, 2024, 7:58 p.m. UTC
Prasad reported a misalignment issue with query-migrationthreads v.s. the
recently migration thread name changes.  So I prepared patch 1, which will
make the main thread on src be named the same way reported either in
pthread or QMP query-migrationthreads API.

Then I found quite something missing around query-migrationthreads.  E.g.,
it so far only accounts multifd sender threads, while it ignored too many
other kinds of migration threads (either postcopy ones, destination multifd
threads, temporary threads etc.).  It means even if an admin can get TIDs
on src QEMU and does pinning (assuming that was the goal of the original
API), it won't be able to do the same for dest QEMUs, which seems to lose
its purpose.

HMP is also missing, I added it too, as thread IDs can definitely be good
candidates during debugging.  If we have QMP ready, it sounds like it
should naturally fit the HMP one too.

I did some cleanups here and there all around.  Feel free to have a look,
thanks.

CI: https://gitlab.com/peterx/qemu/-/pipelines/1475958754

Peter Xu (7):
  migration: Unify names of migration src main thread
  migration: Put thread names together with macros
  migration: Remove thread_id in migration_threads_add()
  migration: Simplify migration-threads API
  migration: Add all threads with QMP query-migrationthreads
  migration: Remove MigrationThread and threadinfo.h
  hmp: Add "info migrationthreads"

 include/monitor/hmp.h          |  1 +
 migration/migration.h          | 17 ++++++++++
 migration/threadinfo.h         | 25 --------------
 migration/colo.c               | 10 ++++--
 migration/dirtyrate.c          | 13 ++++++--
 migration/migration-hmp-cmds.c | 25 ++++++++++++++
 migration/migration.c          | 19 ++++++-----
 migration/multifd.c            | 18 ++++++----
 migration/postcopy-ram.c       | 16 ++++++---
 migration/savevm.c             | 13 +++++---
 migration/threadinfo.c         | 61 ++++++++++++++++++++--------------
 hmp-commands-info.hx           | 13 ++++++++
 12 files changed, 150 insertions(+), 81 deletions(-)
 delete mode 100644 migration/threadinfo.h

Comments

Markus Armbruster Oct. 1, 2024, 5:46 a.m. UTC | #1
Command query-migrationthreads went in without a QAPI ACK.  Issues
review should have caught:

* Flawed documentation.  Fixed in commit e6c60bf02d1.

* It should have been spelled query-migration-threads.  Not worth fixing
  now, I guess.

* What are the use cases?  The commit message doesn't tell!  If it's
  just for debugging, the command should be marked unstable.
Daniel P. Berrangé Oct. 1, 2024, 2:25 p.m. UTC | #2
On Tue, Oct 01, 2024 at 07:46:09AM +0200, Markus Armbruster wrote:
> Command query-migrationthreads went in without a QAPI ACK.  Issues
> review should have caught:
> 
> * Flawed documentation.  Fixed in commit e6c60bf02d1.
> 
> * It should have been spelled query-migration-threads.  Not worth fixing
>   now, I guess.
> 
> * What are the use cases?  The commit message doesn't tell!  If it's
>   just for debugging, the command should be marked unstable.

It is hard to use too.

Lets say a mgmt app wants to restrict migration threads to some
certain pCPUs. It can't call query-migrationthreads beforehand
as the threads don't exist until migration is started. If it
calls after migration is started, then there's a window where
threads are running on arbitrary pCPUs that QEMU has access
to. There's no synchronization point where threads have been
created & can be queried, but are not yet sending data (and
thus burning CPU time)


With regards,
Daniel
Peter Xu Oct. 1, 2024, 3:06 p.m. UTC | #3
On Tue, Oct 01, 2024 at 03:25:14PM +0100, Daniel P. Berrangé wrote:
> On Tue, Oct 01, 2024 at 07:46:09AM +0200, Markus Armbruster wrote:
> > Command query-migrationthreads went in without a QAPI ACK.  Issues
> > review should have caught:
> > 
> > * Flawed documentation.  Fixed in commit e6c60bf02d1.
> > 
> > * It should have been spelled query-migration-threads.  Not worth fixing
> >   now, I guess.
> > 
> > * What are the use cases?  The commit message doesn't tell!  If it's
> >   just for debugging, the command should be marked unstable.
> 
> It is hard to use too.
> 
> Lets say a mgmt app wants to restrict migration threads to some
> certain pCPUs. It can't call query-migrationthreads beforehand
> as the threads don't exist until migration is started. If it
> calls after migration is started, then there's a window where
> threads are running on arbitrary pCPUs that QEMU has access
> to. There's no synchronization point where threads have been
> created & can be queried, but are not yet sending data (and
> thus burning CPU time)

Indeed, I suppose tricks needed if to work with such model, e.g., mgmt
needs to turn bw=0, start migration, query TIDs, then restore bw.

However that still lacks at least the dest multifd threads, as currently it
only reports src multifd threads TIDs.  I don't see why a serious mgmt
would like to pin and care only src threads, not dest threads, which can
also eat as much (or even more) pCPU resources.

For real debugging purpose, I actually don't see a major value out of it
either, because GDB can provide all information that this API wants to
provide, and only better with thread stacks if we want.

Since I don't see how this can be used right, it didn't get proper QAPI
reviews, and further I highly suspect whether this API is consumed by
anyone at all.. in any serious way.  Shall we remove this API (with/without
going through the deprecation process)?

I added the author Jiacheng too.

Thanks,
Markus Armbruster Oct. 2, 2024, 5:58 a.m. UTC | #4
Peter Xu <peterx@redhat.com> writes:

> On Tue, Oct 01, 2024 at 03:25:14PM +0100, Daniel P. Berrangé wrote:
>> On Tue, Oct 01, 2024 at 07:46:09AM +0200, Markus Armbruster wrote:
>> > Command query-migrationthreads went in without a QAPI ACK.  Issues
>> > review should have caught:
>> > 
>> > * Flawed documentation.  Fixed in commit e6c60bf02d1.
>> > 
>> > * It should have been spelled query-migration-threads.  Not worth fixing
>> >   now, I guess.
>> > 
>> > * What are the use cases?  The commit message doesn't tell!  If it's
>> >   just for debugging, the command should be marked unstable.
>> 
>> It is hard to use too.
>> 
>> Lets say a mgmt app wants to restrict migration threads to some
>> certain pCPUs. It can't call query-migrationthreads beforehand
>> as the threads don't exist until migration is started. If it
>> calls after migration is started, then there's a window where
>> threads are running on arbitrary pCPUs that QEMU has access
>> to. There's no synchronization point where threads have been
>> created & can be queried, but are not yet sending data (and
>> thus burning CPU time)
>
> Indeed, I suppose tricks needed if to work with such model, e.g., mgmt
> needs to turn bw=0, start migration, query TIDs, then restore bw.
>
> However that still lacks at least the dest multifd threads, as currently it
> only reports src multifd threads TIDs.  I don't see why a serious mgmt
> would like to pin and care only src threads, not dest threads, which can
> also eat as much (or even more) pCPU resources.

Sounds like there's a use case for management applications querying
TIDs, but query-migrationthreads falls short of serving it.

> For real debugging purpose, I actually don't see a major value out of it
> either, because GDB can provide all information that this API wants to
> provide, and only better with thread stacks if we want.

True.

> Since I don't see how this can be used right, it didn't get proper QAPI
> reviews, and further I highly suspect whether this API is consumed by
> anyone at all.. in any serious way.  Shall we remove this API (with/without
> going through the deprecation process)?

If we decide we want to serve the management application use case now,
we should provide a suitable interface, then deprecate
query-migrationthreads.

If we decide not now or not at all, we can deprecate it right away.
Removal without deprecation is also possible, but I doubt breaking our
compatibility promise is justified.

> I added the author Jiacheng too.

Users of query-migrationthreads, please speak up!
Peter Xu Oct. 10, 2024, 7:46 p.m. UTC | #5
On Wed, Oct 02, 2024 at 07:58:48AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Oct 01, 2024 at 03:25:14PM +0100, Daniel P. Berrangé wrote:
> >> On Tue, Oct 01, 2024 at 07:46:09AM +0200, Markus Armbruster wrote:
> >> > Command query-migrationthreads went in without a QAPI ACK.  Issues
> >> > review should have caught:
> >> > 
> >> > * Flawed documentation.  Fixed in commit e6c60bf02d1.
> >> > 
> >> > * It should have been spelled query-migration-threads.  Not worth fixing
> >> >   now, I guess.
> >> > 
> >> > * What are the use cases?  The commit message doesn't tell!  If it's
> >> >   just for debugging, the command should be marked unstable.
> >> 
> >> It is hard to use too.
> >> 
> >> Lets say a mgmt app wants to restrict migration threads to some
> >> certain pCPUs. It can't call query-migrationthreads beforehand
> >> as the threads don't exist until migration is started. If it
> >> calls after migration is started, then there's a window where
> >> threads are running on arbitrary pCPUs that QEMU has access
> >> to. There's no synchronization point where threads have been
> >> created & can be queried, but are not yet sending data (and
> >> thus burning CPU time)
> >
> > Indeed, I suppose tricks needed if to work with such model, e.g., mgmt
> > needs to turn bw=0, start migration, query TIDs, then restore bw.
> >
> > However that still lacks at least the dest multifd threads, as currently it
> > only reports src multifd threads TIDs.  I don't see why a serious mgmt
> > would like to pin and care only src threads, not dest threads, which can
> > also eat as much (or even more) pCPU resources.
> 
> Sounds like there's a use case for management applications querying
> TIDs, but query-migrationthreads falls short of serving it.
> 
> > For real debugging purpose, I actually don't see a major value out of it
> > either, because GDB can provide all information that this API wants to
> > provide, and only better with thread stacks if we want.
> 
> True.
> 
> > Since I don't see how this can be used right, it didn't get proper QAPI
> > reviews, and further I highly suspect whether this API is consumed by
> > anyone at all.. in any serious way.  Shall we remove this API (with/without
> > going through the deprecation process)?
> 
> If we decide we want to serve the management application use case now,
> we should provide a suitable interface, then deprecate
> query-migrationthreads.
> 
> If we decide not now or not at all, we can deprecate it right away.
> Removal without deprecation is also possible, but I doubt breaking our
> compatibility promise is justified.
> 
> > I added the author Jiacheng too.
> 
> Users of query-migrationthreads, please speak up!

I'll go ahead and remove it.  The current plan is I'll skip the deprecation
procedure for this one because I don't expect anyone would read the
deprecation notification at all... aka, no real user I can ever think of,
who only cares about source pinning not dest.

I'll pick patch 2 out and send separately, which is still a cleanup without
the query-migrationthreads API.

We can keep the discussion going in the new patch.

Thanks,