Message ID | 20220426230654.637939-1-leobras@redhat.com (mailing list archive) |
---|---|
Series | MSG_ZEROCOPY + multifd |
* Leonardo Bras (leobras@redhat.com) wrote:
> This patch series intends to enable MSG_ZEROCOPY in QIOChannel, and make
> use of it for multifd migration performance improvement, by reducing cpu
> usage.
>
> Patch #1 creates new callbacks for QIOChannel, allowing the implementation
> of zero copy writing.
>
> Patch #2 implements io_writev flags and io_flush() on QIOChannelSocket,
> making use of MSG_ZEROCOPY on Linux.
>
> Patch #3 adds a "zero-copy-send" migration property, only available with
> CONFIG_LINUX, and compiled out on any other platform.
> This migration property has to be enabled before multifd migration starts.
>
> Patch #4 adds a helper function that reports whether TLS is going to be used.
> This helper will be later used in patch #5.
>
> Patch #5 changes multifd_send_sync_main() so it returns int instead of void.
> The return value is used to understand if any error happened in the function,
> allowing migration to possibly fail earlier.
>
> Patch #6 implements a workaround: The behavior introduced in d48c3a0445 is
> hard to deal with in zerocopy, so a workaround is introduced to send the
> header in a different syscall, without MSG_ZEROCOPY.
>
> Patch #7 makes use of QIOChannelSocket zero_copy implementation on
> nocomp multifd migration.

Queued.

> Results:
> In preliminary tests, the resource usage of __sys_sendmsg() was reduced by a
> factor of 15, and the overall migration took 13-22% less time, based on a
> synthetic cpu workload.
>
> In further tests, it was noted that, on multifd migration with 8 channels:
> - On idle hosts, migration time was reduced by 10% to 21%.
> - On hosts busy with heavy cpu stress (1 stress thread per cpu, but
>   not cpu-pinned) migration time was reduced by ~25% by enabling zero-copy.
> - On hosts with heavy cpu-pinned workloads (1 stress thread per cpu,
>   cpu-pinned), migration time was reduced by ~66% by enabling zero-copy.

Nice.

> Above tests setup:
> - Sending and Receiving hosts:
>   - CPU : Intel(R) Xeon(R) Platinum 8276L CPU @ 2.20GHz (448 CPUS)
>   - Network card: E810-C (100Gbps)
>   - >1TB RAM
>   - QEMU: Upstream master branch + This patchset
>   - Linux: Upstream v5.15

That configuration is particularly interesting because while it's a big
machine with lots of cores, the individual cores are clocked relatively
slowly; also having lots of cores probably means they're all fighting
over memory bandwidth, so the fewer copies the better.

Dave

> - VM configuration:
>   - 28 VCPUs
>   - 512GB RAM
>
>
> ---
> Changes since v9:
> - Patch #6 got simplified and improved (thanks Daniel)
> - Patch #7 got better comments (thanks Peter Xu)
>
> Changes since v8:
> - Inserted two new patches #5 & #6, previous patch #5 is now #7.
> - Work around an optimization introduced in d48c3a0445
> - Removed unnecessary assert in qio_channel_writev_full_all
>
> Changes since v7:
> - Migration property renamed from zero-copy to zero-copy-send
> - A few early tests added to make misconfigurations fail earlier
> - qio_channel_full*_flags() renamed back to qio_channel_full*()
> - multifd_send_sync_main() reverted back to not receiving a flag,
>   so it always syncs zero-copy when enabled.
> - Improve code quality on a few points
>
> Changes since v6:
> - Remove io_writev_zero_copy(), and make use of io_writev() new flags
>   to achieve the same results.
> - Rename io_flush_zero_copy() to io_flush()
> - Previous patch #2 became too small, so it was squashed in previous
>   patch #3 (now patch #2)
>
> Changes since v5:
> - flush_zero_copy now returns -1 on fail, 0 on success, and 1 when all
>   processed writes were not able to use zerocopy in kernel.
> - qio_channel_socket_poll() removed, using qio_channel_wait() instead
> - ENOBUFS is now processed inside qio_channel_socket_writev_flags()
> - Most zerocopy parameter validation moved to migrate_params_check(),
>   leaving only feature test to socket_outgoing_migration() callback
> - Naming went from *zerocopy to *zero_copy or *zero-copy, due to QAPI/QMP
>   preferences
> - Improved docs
>
> Changes since v4:
> - 3 patches got split into 6
> - Flush is used for syncing after each iteration, instead of only at the end
> - If zerocopy is not available, fail in connect instead of failing on write
> - 'multifd-zerocopy' property renamed to 'zerocopy'
> - Fail migrations that don't support zerocopy, if it's enabled.
> - Instead of checking for zerocopy at each write, save the flags in
>   MultiFDSendParams->write_flags and use them on write
> - Reorganized flag usage in QIOChannelSocket
> - A lot of typos fixed
> - More doc on buffer restrictions
>
> Changes since v3:
> - QIOChannel interface names changed from io_async_{writev,flush} to
>   io_{writev,flush}_zerocopy
> - Instead of falling back in case zerocopy is not implemented, return
>   error and abort operation.
> - Flush now waits as long as needed, or returns an error in case anything
>   goes wrong, aborting the operation.
> - Zerocopy is now conditional in multifd, being set by parameter
>   multifd-zerocopy
> - Moves zerocopy_flush to multifd_send_sync_main() from multifd_save_cleanup
>   so migration can abort if flush goes wrong.
> - Several other small improvements
>
> Changes since v2:
> - Patch #1: One more fallback
> - Patch #2: Fall back to sync if it fails to lock buffer memory in MSG_ZEROCOPY send.
>
> Changes since v1:
> - Reimplemented the patchset using async_write + async_flush approach.
> - Implemented a flush to be able to tell whenever all data was written.
>
> Leonardo Bras (7):
>   QIOChannel: Add flags on io_writev and introduce io_flush callback
>   QIOChannelSocket: Implement io_writev zero copy flag & io_flush for
>     CONFIG_LINUX
>   migration: Add zero-copy-send parameter for QMP/HMP for Linux
>   migration: Add migrate_use_tls() helper
>   multifd: multifd_send_sync_main now returns negative on error
>   multifd: Send header packet without flags if zero-copy-send is enabled
>   multifd: Implement zero copy write in multifd migration
>     (multifd-zero-copy)
>
>  qapi/migration.json                 |  24 ++++++
>  include/io/channel-socket.h         |   2 +
>  include/io/channel.h                |  38 +++++++++-
>  migration/migration.h               |   6 ++
>  migration/multifd.h                 |   4 +-
>  chardev/char-io.c                   |   2 +-
>  hw/remote/mpqemu-link.c             |   2 +-
>  io/channel-buffer.c                 |   1 +
>  io/channel-command.c                |   1 +
>  io/channel-file.c                   |   1 +
>  io/channel-socket.c                 | 110 +++++++++++++++++++++++++++-
>  io/channel-tls.c                    |   1 +
>  io/channel-websock.c                |   1 +
>  io/channel.c                        |  49 ++++++++++---
>  migration/channel.c                 |   3 +-
>  migration/migration.c               |  52 ++++++++++++-
>  migration/multifd.c                 |  75 +++++++++++++++----
>  migration/ram.c                     |  29 ++++++--
>  migration/rdma.c                    |   1 +
>  migration/socket.c                  |  12 ++-
>  monitor/hmp-cmds.c                  |   6 ++
>  scsi/pr-manager-helper.c            |   2 +-
>  tests/unit/test-io-channel-socket.c |   1 +
>  23 files changed, 379 insertions(+), 44 deletions(-)
>
> --
> 2.36.0
>
> From c6fda6f8fb29ceeaf4d36f3787b85196e9bb281f Mon Sep 17 00:00:00 2001
> From: Leonardo Bras <leobras@redhat.com>
> Date: Mon, 25 Apr 2022 17:45:14 -0300
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> To: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
>     Paolo Bonzini <pbonzini@redhat.com>,
>     Elena Ufimtseva <elena.ufimtseva@oracle.com>,
>     Jagannathan Raman <jag.raman@oracle.com>,
>     John G Johnson <john.g.johnson@oracle.com>,
>     "Daniel P. Berrangé" <berrange@redhat.com>,
>     Juan Quintela <quintela@redhat.com>,
>     "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
>     Eric Blake <eblake@redhat.com>,
>     Markus Armbruster <armbru@redhat.com>,
>     Fam Zheng <fam@euphon.net>,
>     Peter Xu <peterx@redhat.com>
> Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
>
>
> Leonardo Bras (7):
>   QIOChannel: Add flags on io_writev and introduce io_flush callback
>   QIOChannelSocket: Implement io_writev zero copy flag & io_flush for
>     CONFIG_LINUX
>   migration: Add zero-copy-send parameter for QMP/HMP for Linux
>   migration: Add migrate_use_tls() helper
>   multifd: multifd_send_sync_main now returns negative on error
>   multifd: Send header packet without flags if zero-copy-send is enabled
>   multifd: Implement zero copy write in multifd migration
>     (multifd-zero-copy)
>
>  qapi/migration.json                 |  24 ++++++
>  include/io/channel-socket.h         |   2 +
>  include/io/channel.h                |  38 +++++++++-
>  migration/migration.h               |   6 ++
>  migration/multifd.h                 |   4 +-
>  chardev/char-io.c                   |   2 +-
>  hw/remote/mpqemu-link.c             |   2 +-
>  io/channel-buffer.c                 |   1 +
>  io/channel-command.c                |   1 +
>  io/channel-file.c                   |   1 +
>  io/channel-socket.c                 | 110 +++++++++++++++++++++++++++-
>  io/channel-tls.c                    |   1 +
>  io/channel-websock.c                |   1 +
>  io/channel.c                        |  49 ++++++++++---
>  migration/channel.c                 |   3 +-
>  migration/migration.c               |  52 ++++++++++++-
>  migration/multifd.c                 |  72 +++++++++++++++---
>  migration/ram.c                     |  29 ++++++--
>  migration/rdma.c                    |   1 +
>  migration/socket.c                  |  12 ++-
>  monitor/hmp-cmds.c                  |   6 ++
>  scsi/pr-manager-helper.c            |   2 +-
>  tests/unit/test-io-channel-socket.c |   1 +
>  23 files changed, 377 insertions(+), 43 deletions(-)
>
> --
> 2.36.0
>
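For readers unfamiliar with the mechanism behind patches #2 and #6, here is a minimal sketch of the underlying Linux socket calls. It is not the QIOChannelSocket code from this series: `zc_enable()` and `send_split()` are hypothetical helper names, the fallback `#define` values assume the generic Linux ABI, and a real caller would also have to loop on short writes and reap completions from the socket error queue.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fallbacks for older libc headers (assumed generic Linux ABI values). */
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Opt a socket in to MSG_ZEROCOPY. Returns 0 on success, -1 on error. */
static int zc_enable(int fd)
{
    int one = 1;
    return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
}

/*
 * Hypothetical helper: send a small header with an ordinary copying send,
 * then the payload in a separate syscall using payload_flags
 * (0 or MSG_ZEROCOPY). Returns the number of bytes accepted by the kernel,
 * or -1 on error. Short writes are not retried in this sketch.
 *
 * When MSG_ZEROCOPY is used, the payload buffer must not be modified
 * until the kernel reports completion on the socket error queue.
 */
static ssize_t send_split(int fd, const void *hdr, size_t hdr_len,
                          const void *payload, size_t payload_len,
                          int payload_flags)
{
    assert(hdr != NULL && payload != NULL);

    ssize_t h = send(fd, hdr, hdr_len, 0);
    if (h < 0) {
        return -1;
    }
    ssize_t p = send(fd, payload, payload_len, payload_flags);
    if (p < 0) {
        return -1;
    }
    return h + p;
}
```

On a connected TCP socket, a caller would first check `zc_enable(fd) == 0` and then pass `MSG_ZEROCOPY` as `payload_flags`; passing 0 gives the ordinary copying path.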
On Thu, Apr 28, 2022 at 11:08 AM Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> * Leonardo Bras (leobras@redhat.com) wrote:
> > This patch series intends to enable MSG_ZEROCOPY in QIOChannel, and make
> > use of it for multifd migration performance improvement, by reducing cpu
> > usage.
> [...]
> Queued.
>
> [...]
> Nice.
>
> > Above tests setup:
> > - Sending and Receiving hosts:
> >   - CPU : Intel(R) Xeon(R) Platinum 8276L CPU @ 2.20GHz (448 CPUS)
> >   - Network card: E810-C (100Gbps)
> >   - >1TB RAM
> >   - QEMU: Upstream master branch + This patchset
> >   - Linux: Upstream v5.15
>
> That configuration is particularly interesting because while it's a big
> machine with lots of cores, the individual cores are clocked relatively
> slowly; also having lots of cores probably means they're all fighting
> over memory bandwidth, so the fewer copies the better.
>
> Dave

Thanks Dave!

Best regards,
Leo

> [...]
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
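The flush return convention listed in the v5 changelog (-1 on failure, 0 on success, 1 when none of the processed writes could use zerocopy in the kernel) can be illustrated with a small bookkeeping model. This is only a sketch under stated assumptions: `ZcState` and the `zc_*` functions are hypothetical names, not code from the series, and where the real flush waits for outstanding completions, this model simply returns -1 if any write is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket bookkeeping for zero-copy writes. */
typedef struct {
    uint64_t queued;      /* zero-copy writes issued so far */
    uint64_t completed;   /* writes the kernel has reported as finished */
    uint64_t flushed;     /* value of 'completed' at the last flush */
    bool saw_zero_copy;   /* some completion since last flush avoided a copy */
} ZcState;

/* Record one successfully queued zero-copy write. */
static void zc_on_send(ZcState *s)
{
    s->queued++;
}

/*
 * Record one completion notification covering writes lo..hi inclusive.
 * The kernel reports such ranges as 32-bit counters that may wrap, so the
 * width is computed in unsigned 32-bit arithmetic. 'copied' is true when
 * the kernel fell back to copying the data.
 */
static void zc_on_completion(ZcState *s, uint32_t lo, uint32_t hi, bool copied)
{
    s->completed += (uint64_t)(uint32_t)(hi - lo) + 1;
    if (!copied) {
        s->saw_zero_copy = true;
    }
}

/* Number of writes still waiting for a completion notification. */
static uint64_t zc_pending(const ZcState *s)
{
    assert(s->completed <= s->queued);
    return s->queued - s->completed;
}

/*
 * Model of the flush result: -1 if writes are still pending (the real
 * flush would wait instead), 0 if nothing new completed or at least one
 * write was truly zero-copy, 1 if every newly completed write was copied.
 */
static int zc_flush_result(ZcState *s)
{
    if (s->completed != s->queued) {
        return -1;
    }
    if (s->completed == s->flushed) {
        return 0; /* nothing new to report */
    }
    int ret = s->saw_zero_copy ? 0 : 1;
    s->flushed = s->completed;
    s->saw_zero_copy = false;
    return ret;
}
```

A return value of 1 is not an error: the data was sent, but the caller learns that zero-copy brought no benefit for that batch.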