[0/4] multifd: various fixes

Message ID 20230922065625.21848-1-elena.ufimtseva@oracle.com (mailing list archive)

Message

Elena Ufimtseva Sept. 22, 2023, 6:56 a.m. UTC
Hello

While working on and testing various live migration scenarios,
I found a few issues.

These are my first patches in live migration, and I would
appreciate suggestions from the community on whether these
patches could be done differently.

[PATCH 1/4] multifd: wait for channels_ready before sending sync
I am not certain about this change, since it seems that
the sync flag could be part of the packets with pages that are
currently being sent out.
However, the traces show that this is not always the case:
multifd_send 230.873 pid=55477 id=0x0 packet_num=0x6f4 normal=0x40 flags=0x1 next_packet_size=0x40000
multifd_send 14.718 pid=55477 id=0x1 packet_num=0x6f5 normal=0x0 flags=0x1 next_packet_size=0x80000
If the sync packet can indeed be a standalone one, then waiting for
channels_ready before sending it seems appropriate, although it wastes
an iteration on a sync-only packet.
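
For illustration only (this is not part of the series), here is a minimal
C sketch that parses a multifd_send trace line in the format shown above
and reports whether it describes a sync-only packet. It assumes that bit 0
of the flags field is the sync flag, which is how the traces are read in
this message, and that a packet with normal=0x0 carries no pages. The
names send_trace, parse_send_trace, is_sync_only and SKETCH_FLAG_SYNC are
hypothetical helpers, not QEMU identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: bit 0 of "flags" marks a sync request in these traces. */
#define SKETCH_FLAG_SYNC 0x1u

/* Fields of interest from one multifd_send trace line (hypothetical type). */
struct send_trace {
    unsigned id;
    unsigned long long packet_num;
    unsigned normal;            /* number of normal pages in the packet */
    unsigned flags;
    unsigned next_packet_size;
};

/* Parse one trace line in the format quoted above; true on success. */
static bool parse_send_trace(const char *line, struct send_trace *t)
{
    assert(line != NULL && t != NULL);

    /* %*f and %*d skip the timestamp and the pid; %x accepts a 0x prefix. */
    int n = sscanf(line,
                   "multifd_send %*f pid=%*d id=%x packet_num=%llx "
                   "normal=%x flags=%x next_packet_size=%x",
                   &t->id, &t->packet_num, &t->normal,
                   &t->flags, &t->next_packet_size);
    return n == 5;
}

/* A packet is treated as sync-only if it has the sync bit and no pages. */
static bool is_sync_only(const struct send_trace *t)
{
    return (t->flags & SKETCH_FLAG_SYNC) != 0 && t->normal == 0;
}
```

Applied to the two lines above, the first packet (normal=0x40) carries
pages together with the sync flag, while the second (normal=0x0) is
classified as sync-only.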
[PATCH 4/4] is also related to 1/4; it fixes the over-accounting in
the case of a sync-only packet.
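
To make the over-accounting concern concrete, here is a toy model in C.
It is a sketch under stated assumptions, not the QEMU code: it assumes
that each sent packet is accounted as a fixed header plus the recorded
payload size, and that a sync-only packet carries no payload. All names
(sketch_channel, sketch_send, SKETCH_HEADER_LEN and so on) and the header
size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HEADER_LEN 64u   /* arbitrary header size for this sketch */

/* Toy per-channel state; not the real multifd channel structure. */
struct sketch_channel {
    uint32_t next_packet_size;  /* payload size recorded for the packet */
    uint64_t accounted_bytes;   /* running transfer counter */
};

/*
 * Account for one packet. A payload of 0 models a sync-only packet,
 * which leaves the recorded payload size untouched.
 */
static void sketch_send(struct sketch_channel *c, uint32_t payload,
                        bool reset_after_send)
{
    assert(c != NULL);

    if (payload > 0) {
        c->next_packet_size = payload;
    }
    c->accounted_bytes += SKETCH_HEADER_LEN + c->next_packet_size;
    if (reset_after_send) {
        c->next_packet_size = 0;    /* the reset that avoids stale sizes */
    }
}

/* Bytes accounted for one page packet followed by one sync-only packet. */
static uint64_t sketch_pages_then_sync(uint32_t payload, bool reset_after_send)
{
    struct sketch_channel c = { 0, 0 };

    sketch_send(&c, payload, reset_after_send);  /* packet with pages */
    sketch_send(&c, 0, reset_after_send);        /* sync-only packet */
    return c.accounted_bytes;
}
```

In this model, without the reset the sync-only packet is charged the
stale payload size of the previous packet, so the payload is counted
twice; with the reset, only the header of the sync-only packet is added.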


Thank you in advance; I look forward to your feedback.

Elena

Elena Ufimtseva (4):
  multifd: wait for channels_ready before sending sync
  migration: check for rate_limit_max for RATE_LIMIT_DISABLED
  multifd: fix counters in multifd_send_thread
  multifd: reset next_packet_len after sending pages

 migration/migration-stats.c |  8 ++++----
 migration/multifd.c         | 11 ++++++-----
 2 files changed, 10 insertions(+), 9 deletions(-)

Comments

Fabiano Rosas Sept. 22, 2023, 2:18 p.m. UTC | #1
Elena Ufimtseva <elena.ufimtseva@oracle.com> writes:

> Hello
>
> While working on and testing various live migration scenarios,
> I found a few issues.
>
> These are my first patches in live migration, and I would
> appreciate suggestions from the community on whether these
> patches could be done differently.
>
> [PATCH 1/4] multifd: wait for channels_ready before sending sync
> I am not certain about this change, since it seems that
> the sync flag could be part of the packets with pages that are
> currently being sent out.
> However, the traces show that this is not always the case:
> multifd_send 230.873 pid=55477 id=0x0 packet_num=0x6f4 normal=0x40 flags=0x1 next_packet_size=0x40000
> multifd_send 14.718 pid=55477 id=0x1 packet_num=0x6f5 normal=0x0 flags=0x1 next_packet_size=0x80000
> If the sync packet can indeed be a standalone one, then waiting for
> channels_ready before sending it seems appropriate, although it wastes
> an iteration on a sync-only packet.

I haven't looked at this code for a while, so there's some context
switching to be made, but you're definitely on the right track here. I
actually have an unsent patch doing almost the same as your patch
1/4. I'll comment more there.

About the sync being standalone, I would expect that to always be the
case since we're incrementing packet_num at that point.

> [PATCH 4/4] is also related to 1/4; it fixes the over-accounting in
> the case of a sync-only packet.
>
>
> Thank you in advance; I look forward to your feedback.
>
> Elena
>
> Elena Ufimtseva (4):
>   multifd: wait for channels_ready before sending sync
>   migration: check for rate_limit_max for RATE_LIMIT_DISABLED
>   multifd: fix counters in multifd_send_thread
>   multifd: reset next_packet_len after sending pages
>
>  migration/migration-stats.c |  8 ++++----
>  migration/multifd.c         | 11 ++++++-----
>  2 files changed, 10 insertions(+), 9 deletions(-)