Message ID | 20240226195654.934709-7-hao.xiang@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | Introduce multifd zero page checking. |
Hao Xiang <hao.xiang@bytedance.com> writes:

> This change extends the MigrationStatus interface to track zero pages
> and zero bytes counter.
>
> Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index a0a85a0312..171734c07e 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -63,6 +63,10 @@
> # between 0 and @dirty-sync-count * @multifd-channels. (since
> # 7.1)
> #
> +# @zero-pages: number of zero pages (since 9.0)
> +#
> +# @zero-bytes: number of zero bytes sent (since 9.0)
> +#

Awfully terse. How are these two related?

> # Features:
> #
> # @deprecated: Member @skipped is always zero since 1.5.3

[...]
On Wed, Feb 28, 2024 at 1:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> Hao Xiang <hao.xiang@bytedance.com> writes:
>
> > This change extends the MigrationStatus interface to track zero pages
> > and zero bytes counter.
> >
> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>
> [...]
>
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index a0a85a0312..171734c07e 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -63,6 +63,10 @@
> > # between 0 and @dirty-sync-count * @multifd-channels. (since
> > # 7.1)
> > #
> > +# @zero-pages: number of zero pages (since 9.0)
> > +#
> > +# @zero-bytes: number of zero bytes sent (since 9.0)
> > +#
>
> Awfully terse. How are these two related?

Sorry I forgot to address the same feedback from the last version.
zero-pages are the number of pages being detected as all "zero" and
hence the payload isn't sent over the network. zero-bytes is basically
zero-pages * page_size. It's the number of bytes migrated (but not
actually sent through the network) because they are all "zero". These
two are related to the existing interface below. normal and
normal-bytes are the same representation of pages who are not all
"zero" and are actually sent through the network.

# @normal: number of normal pages (since 1.2)
#
# @normal-bytes: number of normal bytes sent (since 1.2)

>
> > # Features:
> > #
> > # @deprecated: Member @skipped is always zero since 1.5.3
>
> [...]
>
Hao Xiang <hao.xiang@bytedance.com> writes:

> On Wed, Feb 28, 2024 at 1:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Hao Xiang <hao.xiang@bytedance.com> writes:
>>
>> > This change extends the MigrationStatus interface to track zero pages
>> > and zero bytes counter.
>> >
>> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>>
>> [...]
>>
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index a0a85a0312..171734c07e 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -63,6 +63,10 @@
>> > # between 0 and @dirty-sync-count * @multifd-channels. (since
>> > # 7.1)
>> > #
>> > +# @zero-pages: number of zero pages (since 9.0)
>> > +#
>> > +# @zero-bytes: number of zero bytes sent (since 9.0)
>> > +#
>>
>> Awfully terse. How are these two related?
>
> Sorry I forgot to address the same feedback from the last version.

Happens :)

> zero-pages are the number of pages being detected as all "zero" and
> hence the payload isn't sent over the network. zero-bytes is basically
> zero-pages * page_size. It's the number of bytes migrated (but not
> actually sent through the network) because they are all "zero". These
> two are related to the existing interface below. normal and
> normal-bytes are the same representation of pages who are not all
> "zero" and are actually sent through the network.
>
> # @normal: number of normal pages (since 1.2)
> #
> # @normal-bytes: number of normal bytes sent (since 1.2)

We also have

    # @duplicate: number of duplicate (zero) pages (since 1.2)
    #
    # @skipped: number of skipped zero pages. Always zero, only provided for
    # compatibility (since 1.5)

Page skipping was introduced in 1.5, and withdrawn in 1.5.3 and 1.6.
@skipped was formally deprecated in 8.1. It'll soon be gone, no need to
worry about it now.

That leaves three values related to pages sent: @normal (and
@normal-bytes), @duplicate (but no @duplicate-bytes), and @zero-pages
(and @zero-bytes).

I unwittingly created a naming inconsistency between @normal,
@duplicate, and @zero-pages when I asked you to rename @zero to
@zero-pages.

The meaning of the three values is not obvious, and the doc comments
don't explain them. Can you, or anybody familiar with migration,
explain them to me?

MigrationStats return some values as bytes, some as pages, and some as
both. I hate that. Can we standardize on bytes?

>>
>> > # Features:
>> > #
>> > # @deprecated: Member @skipped is always zero since 1.5.3
>>
>> [...]
>>
On Wed, Feb 28, 2024 at 10:01 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Hao Xiang <hao.xiang@bytedance.com> writes:
>
> > On Wed, Feb 28, 2024 at 1:52 AM Markus Armbruster <armbru@redhat.com> wrote:
> >>
> >> Hao Xiang <hao.xiang@bytedance.com> writes:
> >>
> >> > This change extends the MigrationStatus interface to track zero pages
> >> > and zero bytes counter.
> >> >
> >> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> >>
> >> [...]
> >>
> >> > diff --git a/qapi/migration.json b/qapi/migration.json
> >> > index a0a85a0312..171734c07e 100644
> >> > --- a/qapi/migration.json
> >> > +++ b/qapi/migration.json
> >> > @@ -63,6 +63,10 @@
> >> > # between 0 and @dirty-sync-count * @multifd-channels. (since
> >> > # 7.1)
> >> > #
> >> > +# @zero-pages: number of zero pages (since 9.0)
> >> > +#
> >> > +# @zero-bytes: number of zero bytes sent (since 9.0)
> >> > +#
> >>
> >> Awfully terse. How are these two related?
> >
> > Sorry I forgot to address the same feedback from the last version.
>
> Happens :)
>
> > zero-pages are the number of pages being detected as all "zero" and
> > hence the payload isn't sent over the network. zero-bytes is basically
> > zero-pages * page_size. It's the number of bytes migrated (but not
> > actually sent through the network) because they are all "zero". These
> > two are related to the existing interface below. normal and
> > normal-bytes are the same representation of pages who are not all
> > "zero" and are actually sent through the network.
> >
> > # @normal: number of normal pages (since 1.2)
> > #
> > # @normal-bytes: number of normal bytes sent (since 1.2)
>
> We also have
>
> # @duplicate: number of duplicate (zero) pages (since 1.2)
> #
> # @skipped: number of skipped zero pages. Always zero, only provided for
> # compatibility (since 1.5)
>
> Page skipping was introduced in 1.5, and withdrawn in 1.5.3 and 1.6.
> @skipped was formally deprecated in 8.1. It'll soon be gone, no need to
> worry about it now.
>
> That leaves three values related to pages sent: @normal (and
> @normal-bytes), @duplicate (but no @duplicate-bytes), and @zero-pages
> (and @zero-bytes).
>
> I unwittingly created a naming inconsistency between @normal,
> @duplicate, and @zero-pages when I asked you to rename @zero to
> @zero-pages.
>
> The meaning of the three values is not obvious, and the doc comments
> don't explain them. Can you, or anybody familiar with migration,
> explain them to me?
>
> MigrationStats return some values as bytes, some as pages, and some as
> both. I hate that. Can we standardize on bytes?

I added zero/zero-bytes because I thought they were not there. But it
turns out "duplicate" is for that purpose. "zero/zero-bytes" is really
additional information to "normal/normal-bytes". Peter suggested that
if we add "zero/zero-bytes" we can slowly retire "duplicate" at a
later point.

I don't know the historical reason why pages/bytes are used the way it
is today. The way I understand migration, the granularity of ram
migration is "page". There are only two types of pages 1) normal 2)
zero. Zero pages' payload are not sent through the network because we
already know what it looks like. Only the page offset is sent. Normal
pages are pages that are not zero. The entire page is sent through the
network to the target host. If a user knows the zero/normal count,
they can already calculate the zero-bytes/normal-bytes (zero/normal *
page size) but it's just convenient to see both. During development, I
check on these counters a lot and they are useful.

>
> >>
> >> > # Features:
> >> > #
> >> > # @deprecated: Member @skipped is always zero since 1.5.3
> >>
> >> [...]
> >>
>
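To make the two page types described above concrete, here is a minimal Python sketch of how zero and normal pages could be told apart and counted. It is an illustration only, not QEMU's implementation: `PAGE_SIZE`, `PageCounters`, `is_zero_page` and `account_pages` are hypothetical names, and the 4096-byte page size is an assumption (QEMU reports the real value as @page-size).

```python
from dataclasses import dataclass

# Assumed page size in bytes; the real value is reported as @page-size.
PAGE_SIZE = 4096


@dataclass
class PageCounters:
    """Hypothetical stand-in for the zero/normal page counters."""
    zero_pages: int = 0
    normal_pages: int = 0

    @property
    def zero_bytes(self) -> int:
        # Bytes represented by zero pages (their payload is not sent).
        return self.zero_pages * PAGE_SIZE

    @property
    def normal_bytes(self) -> int:
        # Bytes of payload carried by normal pages.
        return self.normal_pages * PAGE_SIZE


def is_zero_page(page: bytes) -> bool:
    """Return True when every byte of the page is zero."""
    return not any(page)


def account_pages(pages) -> PageCounters:
    """Classify each page as zero or normal and count it."""
    counters = PageCounters()
    for page in pages:
        if is_zero_page(page):
            counters.zero_pages += 1
        else:
            counters.normal_pages += 1
    return counters


# Two all-zero pages and one page with a single non-zero byte.
pages = [bytes(PAGE_SIZE), b"\x01" + bytes(PAGE_SIZE - 1), bytes(PAGE_SIZE)]
counters = account_pages(pages)
print(counters.zero_pages, counters.normal_pages)
```

In this sketch the byte totals are derived from the page counts, which mirrors how the patch computes zero_bytes as zero_pages * page_size.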
Hao Xiang <hao.xiang@bytedance.com> writes:

> On Wed, Feb 28, 2024 at 10:01 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Hao Xiang <hao.xiang@bytedance.com> writes:
>>
>> > On Wed, Feb 28, 2024 at 1:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>> >>
>> >> Hao Xiang <hao.xiang@bytedance.com> writes:
>> >>
>> >> > This change extends the MigrationStatus interface to track zero pages
>> >> > and zero bytes counter.
>> >> >
>> >> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>> >>
>> >> [...]
>> >>
>> >> > diff --git a/qapi/migration.json b/qapi/migration.json
>> >> > index a0a85a0312..171734c07e 100644
>> >> > --- a/qapi/migration.json
>> >> > +++ b/qapi/migration.json
>> >> > @@ -63,6 +63,10 @@
>> >> > # between 0 and @dirty-sync-count * @multifd-channels. (since
>> >> > # 7.1)
>> >> > #
>> >> > +# @zero-pages: number of zero pages (since 9.0)
>> >> > +#
>> >> > +# @zero-bytes: number of zero bytes sent (since 9.0)
>> >> > +#
>> >>
>> >> Awfully terse. How are these two related?
>> >
>> > Sorry I forgot to address the same feedback from the last version.
>>
>> Happens :)
>>
>> > zero-pages are the number of pages being detected as all "zero" and
>> > hence the payload isn't sent over the network. zero-bytes is basically
>> > zero-pages * page_size. It's the number of bytes migrated (but not
>> > actually sent through the network) because they are all "zero". These
>> > two are related to the existing interface below. normal and
>> > normal-bytes are the same representation of pages who are not all
>> > "zero" and are actually sent through the network.
>> >
>> > # @normal: number of normal pages (since 1.2)
>> > #
>> > # @normal-bytes: number of normal bytes sent (since 1.2)
>>
>> We also have
>>
>> # @duplicate: number of duplicate (zero) pages (since 1.2)
>> #
>> # @skipped: number of skipped zero pages. Always zero, only provided for
>> # compatibility (since 1.5)
>>
>> Page skipping was introduced in 1.5, and withdrawn in 1.5.3 and 1.6.
>> @skipped was formally deprecated in 8.1. It'll soon be gone, no need to
>> worry about it now.
>>
>> That leaves three values related to pages sent: @normal (and
>> @normal-bytes), @duplicate (but no @duplicate-bytes), and @zero-pages
>> (and @zero-bytes).
>>
>> I unwittingly created a naming inconsistency between @normal,
>> @duplicate, and @zero-pages when I asked you to rename @zero to
>> @zero-pages.
>>
>> The meaning of the three values is not obvious, and the doc comments
>> don't explain them. Can you, or anybody familiar with migration,
>> explain them to me?
>>
>> MigrationStats return some values as bytes, some as pages, and some as
>> both. I hate that. Can we standardize on bytes?
>
> I added zero/zero-bytes because I thought they were not there. But it
> turns out "duplicate" is for that purpose. "zero/zero-bytes" is really
> additional information to "normal/normal-bytes". Peter suggested that
> if we add "zero/zero-bytes" we can slowly retire "duplicate" at a
> later point.

"zero" is a better name than "duplicate". Identical non-zero pages are
possible, and they are duplicates, too.

If you add @zero with the intent to replace @duplicate, you should
immediately deprecate @duplicate. If you need assistance with that,
just ask.

> I don't know the historical reason why pages/bytes are used the way it
> is today. The way I understand migration, the granularity of ram
> migration is "page". There are only two types of pages 1) normal 2)
> zero. Zero pages' payload are not sent through the network because we
> already know what it looks like. Only the page offset is sent. Normal
> pages are pages that are not zero. The entire page is sent through the
> network to the target host.

This is not at all clear from the documentation of MigrationStats. I
think the documentation needs improvement there.

> If a user knows the zero/normal count,
> they can already calculate the zero-bytes/normal-bytes (zero/normal *
> page size)

Yes, because member @page-size tells them the multiplier.

> but it's just convenient to see both. During development, I
> check on these counters a lot and they are useful.

QMP is for machines. Machines don't need or want the same quantity in
two units. Providing them both bytes and pages is a design mistake.
Whether it's worth correcting now is of course debatable.

Regardless, the fact @normal-bytes = @normal * @page-size needs to be
documented. We have

    # @page-size: The number of bytes per page for the various page-based
    # statistics (since 2.10)

The fact that I inquired how zero-pages and zero-bytes are related
might indicate that this isn't quite clear enough.

[...]
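As a rough illustration of the point that a machine client needs only one unit, the Python sketch below derives byte totals from page counts and @page-size alone. The member names follow this patch and the quoted schema; `byte_counts` and the sample values are hypothetical, and the dictionary merely stands in for the "ram" part of a query-migrate reply.

```python
def byte_counts(ram: dict) -> dict:
    """Derive byte totals from page counts and @page-size alone.

    `ram` stands in for the "ram" member of a query-migrate reply.
    """
    page_size = ram["page-size"]
    return {
        # @normal-bytes = @normal * @page-size
        "normal-bytes": ram.get("normal", 0) * page_size,
        # @zero-bytes = @zero-pages * @page-size (names as proposed here)
        "zero-bytes": ram.get("zero-pages", 0) * page_size,
    }


# Made-up sample values, for illustration only.
sample_ram = {"page-size": 4096, "normal": 3333, "zero-pages": 10}
derived = byte_counts(sample_ram)
print(derived)
```

Under these assumptions, the "-bytes" members carry no information beyond the page counts and the multiplier, which is the relationship the doc comments are being asked to spell out.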
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 7e96ae6ffd..a38ad0255d 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -111,6 +111,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->ram->normal);
         monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",
                        info->ram->normal_bytes >> 10);
+        monitor_printf(mon, "zero pages: %" PRIu64 " pages\n",
+                       info->ram->zero_pages);
+        monitor_printf(mon, "zero bytes: %" PRIu64 " kbytes\n",
+                       info->ram->zero_bytes >> 10);
         monitor_printf(mon, "dirty sync count: %" PRIu64 "\n",
                        info->ram->dirty_sync_count);
         monitor_printf(mon, "page size: %" PRIu64 " kbytes\n",
diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..a99f86f273 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1112,6 +1112,8 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
     info->ram->skipped = 0;
     info->ram->normal = stat64_get(&mig_stats.normal_pages);
     info->ram->normal_bytes = info->ram->normal * page_size;
+    info->ram->zero_pages = stat64_get(&mig_stats.zero_pages);
+    info->ram->zero_bytes = info->ram->zero_pages * page_size;
     info->ram->mbps = s->mbps;
     info->ram->dirty_sync_count =
         stat64_get(&mig_stats.dirty_sync_count);
diff --git a/qapi/migration.json b/qapi/migration.json
index a0a85a0312..171734c07e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -63,6 +63,10 @@
 # between 0 and @dirty-sync-count * @multifd-channels. (since
 # 7.1)
 #
+# @zero-pages: number of zero pages (since 9.0)
+#
+# @zero-bytes: number of zero bytes sent (since 9.0)
+#
 # Features:
 #
 # @deprecated: Member @skipped is always zero since 1.5.3
@@ -81,7 +85,8 @@
            'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
            'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
            'postcopy-bytes': 'uint64',
-           'dirty-sync-missed-zero-copy': 'uint64' } }
+           'dirty-sync-missed-zero-copy': 'uint64',
+           'zero-pages': 'int', 'zero-bytes': 'size' } }
 
 ##
 # @XBZRLECacheStats:
@@ -332,6 +337,8 @@
 # "duplicate":123,
 # "normal":123,
 # "normal-bytes":123456,
+# "zero-pages":123,
+# "zero-bytes":123456,
 # "dirty-sync-count":15
 # }
 # }
@@ -358,6 +365,8 @@
 # "duplicate":123,
 # "normal":123,
 # "normal-bytes":123456,
+# "zero-pages":123,
+# "zero-bytes":123456,
 # "dirty-sync-count":15
 # }
 # }
@@ -379,6 +388,8 @@
 # "duplicate":123,
 # "normal":123,
 # "normal-bytes":123456,
+# "zero-pages":123,
+# "zero-bytes":123456,
 # "dirty-sync-count":15
 # },
 # "disk":{
@@ -405,6 +416,8 @@
 # "duplicate":10,
 # "normal":3333,
 # "normal-bytes":3412992,
+# "zero-pages":3333,
+# "zero-bytes":3412992,
 # "dirty-sync-count":15
 # },
 # "xbzrle-cache":{
diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py
index 608d7270f6..693e07c227 100644
--- a/tests/migration/guestperf/engine.py
+++ b/tests/migration/guestperf/engine.py
@@ -92,6 +92,8 @@ def _migrate_progress(self, vm):
                 info["ram"].get("skipped", 0),
                 info["ram"].get("normal", 0),
                 info["ram"].get("normal-bytes", 0),
+                info["ram"].get("zero-pages", 0),
+                info["ram"].get("zero-bytes", 0),
                 info["ram"].get("dirty-pages-rate", 0),
                 info["ram"].get("mbps", 0),
                 info["ram"].get("dirty-sync-count", 0)
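The guestperf change above reads the new members with a default of 0, so a reply that lacks them still parses. A possible variation, sketched below under the assumption that @duplicate counts the same zero pages (as stated in the discussion), falls back to "duplicate" when "zero-pages" is absent. `zero_page_count` is a hypothetical helper, and "zero-pages" is only the name proposed in this patch.

```python
def zero_page_count(ram: dict) -> int:
    """Return the zero page count from the "ram" part of a reply.

    Prefers "zero-pages" (the name proposed in this patch) and falls
    back to the older "duplicate" member, then to 0.
    """
    if "zero-pages" in ram:
        return ram["zero-pages"]
    return ram.get("duplicate", 0)


# Made-up replies, for illustration only.
new_style = {"zero-pages": 7, "duplicate": 7}
old_style = {"duplicate": 5}
print(zero_page_count(new_style), zero_page_count(old_style))
```

Whether such a fallback is wanted depends on how the @duplicate deprecation discussed in this thread is resolved.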
This change extends the MigrationStatus interface to track zero pages
and zero bytes counter.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 migration/migration-hmp-cmds.c      |  4 ++++
 migration/migration.c               |  2 ++
 qapi/migration.json                 | 15 ++++++++++++++-
 tests/migration/guestperf/engine.py |  2 ++
 4 files changed, 22 insertions(+), 1 deletion(-)