Message ID | 20250207142758.6936-4-farosas@suse.de (mailing list archive) |
---|---|
State | New |
Series | crypto,io,migration: Add support to gnutls_bye() |
On Fri, Feb 07, 2025 at 11:27:53AM -0300, Fabiano Rosas wrote:
> The multifd recv side has been getting a TLS error of
> GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
> side closes the sockets without ending the TLS session. This has been
> masked by the code not checking the migration error after loadvm.
>
> Start ending the TLS session at multifd_send_shutdown() so the recv
> side always sees a clean termination (EOF) and we can start to
> differentiate that from an actual premature termination that might
> possibly happen in the middle of the migration.
>
> There's nothing to be done if a previous migration error has already
> broken the connection, so add a comment explaining it and ignore any
> errors coming from gnutls_bye().
>
> This doesn't break compat with older recv-side QEMUs because EOF has
> always caused the recv thread to exit cleanly.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

One trivial comment..

> ---
>  migration/multifd.c | 34 +++++++++++++++++++++++++++++++++-
>  migration/tls.c     |  5 +++++
>  migration/tls.h     |  2 +-
>  3 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index ab73d6d984..b57cad3bb1 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -490,6 +490,32 @@ void multifd_send_shutdown(void)
>          return;
>      }
>
> +    for (i = 0; i < migrate_multifd_channels(); i++) {
> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> +
> +        /* thread_created implies the TLS handshake has succeeded */
> +        if (p->tls_thread_created && p->thread_created) {
> +            Error *local_err = NULL;
> +            /*
> +             * The destination expects the TLS session to always be
> +             * properly terminated. This helps to detect a premature
> +             * termination in the middle of the stream. Note that
> +             * older QEMUs always break the connection on the source
> +             * and the destination always sees
> +             * GNUTLS_E_PREMATURE_TERMINATION.
> +             */
> +            migration_tls_channel_end(p->c, &local_err);
> +
> +            if (local_err) {
> +                /*
> +                 * The above can fail with broken pipe due to a
> +                 * previous migration error, ignore the error.
> +                 */
> +                assert(migration_has_failed(migrate_get_current()));

Considering this is still src, do we want to be softer on this by
error_report?

Logically !migration_has_failed() means it succeeded, so we can throw src
qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
of comment.. Your call.

> +            }
> +        }
> +    }
> +
>      multifd_send_terminate_threads();
>
>      for (i = 0; i < migrate_multifd_channels(); i++) {
> @@ -1141,7 +1167,13 @@ static void *multifd_recv_thread(void *opaque)
>
>              ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
>                                             p->packet_len, &local_err);
> -            if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
> +            if (!ret) {
> +                /* EOF */
> +                assert(!local_err);
> +                break;
> +            }
> +
> +            if (ret == -1) {
>                  break;
>              }
>
> diff --git a/migration/tls.c b/migration/tls.c
> index fa03d9136c..5cbf952383 100644
> --- a/migration/tls.c
> +++ b/migration/tls.c
> @@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
>                                NULL);
>  }
>
> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
> +{
> +    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
> +}
> +
>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
>  {
>      if (!migrate_tls()) {
> diff --git a/migration/tls.h b/migration/tls.h
> index 5797d153cb..58b25e1228 100644
> --- a/migration/tls.h
> +++ b/migration/tls.h
> @@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
>                                     QIOChannel *ioc,
>                                     const char *hostname,
>                                     Error **errp);
> -
> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
>  /* Whether the QIO channel requires further TLS handshake? */
>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
>
> --
> 2.35.3
>
Peter Xu <peterx@redhat.com> writes:

> On Fri, Feb 07, 2025 at 11:27:53AM -0300, Fabiano Rosas wrote:
>> The multifd recv side has been getting a TLS error of
>> GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
>> side closes the sockets without ending the TLS session. This has been
>> masked by the code not checking the migration error after loadvm.
>>
>> Start ending the TLS session at multifd_send_shutdown() so the recv
>> side always sees a clean termination (EOF) and we can start to
>> differentiate that from an actual premature termination that might
>> possibly happen in the middle of the migration.
>>
>> There's nothing to be done if a previous migration error has already
>> broken the connection, so add a comment explaining it and ignore any
>> errors coming from gnutls_bye().
>>
>> This doesn't break compat with older recv-side QEMUs because EOF has
>> always caused the recv thread to exit cleanly.
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> One trivial comment..
>
>> ---
>>  migration/multifd.c | 34 +++++++++++++++++++++++++++++++++-
>>  migration/tls.c     |  5 +++++
>>  migration/tls.h     |  2 +-
>>  3 files changed, 39 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index ab73d6d984..b57cad3bb1 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -490,6 +490,32 @@ void multifd_send_shutdown(void)
>>          return;
>>      }
>>
>> +    for (i = 0; i < migrate_multifd_channels(); i++) {
>> +        MultiFDSendParams *p = &multifd_send_state->params[i];
>> +
>> +        /* thread_created implies the TLS handshake has succeeded */
>> +        if (p->tls_thread_created && p->thread_created) {
>> +            Error *local_err = NULL;
>> +            /*
>> +             * The destination expects the TLS session to always be
>> +             * properly terminated. This helps to detect a premature
>> +             * termination in the middle of the stream. Note that
>> +             * older QEMUs always break the connection on the source
>> +             * and the destination always sees
>> +             * GNUTLS_E_PREMATURE_TERMINATION.
>> +             */
>> +            migration_tls_channel_end(p->c, &local_err);
>> +
>> +            if (local_err) {
>> +                /*
>> +                 * The above can fail with broken pipe due to a
>> +                 * previous migration error, ignore the error.
>> +                 */
>> +                assert(migration_has_failed(migrate_get_current()));
>
> Considering this is still src, do we want to be softer on this by
> error_report?
>
> Logically !migration_has_failed() means it succeeded, so we can throw src
> qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
> of comment.. Your call.
>

Maybe even a warning? If at this point migration succeeded, it's probably
best to let cleanup carry on.

>> +            }
>> +        }
>> +    }
>> +
>>      multifd_send_terminate_threads();
>>
>>      for (i = 0; i < migrate_multifd_channels(); i++) {
>> @@ -1141,7 +1167,13 @@ static void *multifd_recv_thread(void *opaque)
>>
>>              ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
>>                                             p->packet_len, &local_err);
>> -            if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
>> +            if (!ret) {
>> +                /* EOF */
>> +                assert(!local_err);
>> +                break;
>> +            }
>> +
>> +            if (ret == -1) {
>>                  break;
>>              }
>>
>> diff --git a/migration/tls.c b/migration/tls.c
>> index fa03d9136c..5cbf952383 100644
>> --- a/migration/tls.c
>> +++ b/migration/tls.c
>> @@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
>>                                NULL);
>>  }
>>
>> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
>> +{
>> +    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
>> +}
>> +
>>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
>>  {
>>      if (!migrate_tls()) {
>> diff --git a/migration/tls.h b/migration/tls.h
>> index 5797d153cb..58b25e1228 100644
>> --- a/migration/tls.h
>> +++ b/migration/tls.h
>> @@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
>>                                     QIOChannel *ioc,
>>                                     const char *hostname,
>>                                     Error **errp);
>> -
>> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
>>  /* Whether the QIO channel requires further TLS handshake? */
>>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
>>
>> --
>> 2.35.3
>>
On Fri, Feb 07, 2025 at 03:15:48PM -0300, Fabiano Rosas wrote:
> >> +    for (i = 0; i < migrate_multifd_channels(); i++) {
> >> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> >> +
> >> +        /* thread_created implies the TLS handshake has succeeded */
> >> +        if (p->tls_thread_created && p->thread_created) {
> >> +            Error *local_err = NULL;
> >> +            /*
> >> +             * The destination expects the TLS session to always be
> >> +             * properly terminated. This helps to detect a premature
> >> +             * termination in the middle of the stream. Note that
> >> +             * older QEMUs always break the connection on the source
> >> +             * and the destination always sees
> >> +             * GNUTLS_E_PREMATURE_TERMINATION.
> >> +             */
> >> +            migration_tls_channel_end(p->c, &local_err);
> >> +
> >> +            if (local_err) {
> >> +                /*
> >> +                 * The above can fail with broken pipe due to a
> >> +                 * previous migration error, ignore the error.
> >> +                 */
> >> +                assert(migration_has_failed(migrate_get_current()));
> >
> > Considering this is still src, do we want to be softer on this by
> > error_report?
> >
> > Logically !migration_has_failed() means it succeeded, so we can throw src
> > qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
> > of comment.. Your call.
> >
>
> Maybe even a warning? If at this point migration succeeded, it's probably
> best to let cleanup carry on.

Yep, warning sounds good too.
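
For readers following the assert-versus-warning discussion above, here is a minimal, self-contained sketch of the softer behaviour being proposed. It is not the code that was merged and it does not use QEMU's real types: `Error`, `ByeResult` and `handle_tls_bye_result()` are hypothetical stand-ins invented for this illustration, and the warning is printed with plain `fprintf()` rather than any QEMU reporting helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct {
    const char *msg;
} Error;

/* Hypothetical result codes, introduced only for this illustration. */
typedef enum {
    BYE_OK,                           /* TLS session ended cleanly */
    BYE_FAILED_AFTER_MIGRATION_ERROR, /* expected: connection already broken */
    BYE_FAILED_UNEXPECTEDLY,          /* migration succeeded, bye still failed */
} ByeResult;

/*
 * Decide what to do with the outcome of ending the TLS session.
 * Instead of asserting that the migration has failed, an unexpected
 * failure only produces a warning so that cleanup can carry on.
 */
static ByeResult handle_tls_bye_result(const Error *err, bool migration_failed)
{
    if (!err) {
        return BYE_OK;
    }

    /* An error without a message would be a bug in the caller. */
    assert(err->msg != NULL);

    if (migration_failed) {
        /* E.g. broken pipe caused by the earlier migration error: ignore. */
        return BYE_FAILED_AFTER_MIGRATION_ERROR;
    }

    fprintf(stderr, "warning: failed to end TLS session: %s\n", err->msg);
    return BYE_FAILED_UNEXPECTEDLY;
}
```

The only behavioural difference from the posted hunk is the last branch: where the patch would abort the source QEMU, this sketch reports the problem and lets the shutdown loop continue with the remaining channels.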
diff --git a/migration/multifd.c b/migration/multifd.c
index ab73d6d984..b57cad3bb1 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -490,6 +490,32 @@ void multifd_send_shutdown(void)
         return;
     }
 
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        /* thread_created implies the TLS handshake has succeeded */
+        if (p->tls_thread_created && p->thread_created) {
+            Error *local_err = NULL;
+            /*
+             * The destination expects the TLS session to always be
+             * properly terminated. This helps to detect a premature
+             * termination in the middle of the stream. Note that
+             * older QEMUs always break the connection on the source
+             * and the destination always sees
+             * GNUTLS_E_PREMATURE_TERMINATION.
+             */
+            migration_tls_channel_end(p->c, &local_err);
+
+            if (local_err) {
+                /*
+                 * The above can fail with broken pipe due to a
+                 * previous migration error, ignore the error.
+                 */
+                assert(migration_has_failed(migrate_get_current()));
+            }
+        }
+    }
+
     multifd_send_terminate_threads();
 
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -1141,7 +1167,13 @@ static void *multifd_recv_thread(void *opaque)
 
             ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
                                            p->packet_len, &local_err);
-            if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
+            if (!ret) {
+                /* EOF */
+                assert(!local_err);
+                break;
+            }
+
+            if (ret == -1) {
                 break;
             }
 
diff --git a/migration/tls.c b/migration/tls.c
index fa03d9136c..5cbf952383 100644
--- a/migration/tls.c
+++ b/migration/tls.c
@@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
                               NULL);
 }
 
+void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
+{
+    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
+}
+
 bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
 {
     if (!migrate_tls()) {
diff --git a/migration/tls.h b/migration/tls.h
index 5797d153cb..58b25e1228 100644
--- a/migration/tls.h
+++ b/migration/tls.h
@@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
                                    QIOChannel *ioc,
                                    const char *hostname,
                                    Error **errp);
-
+void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
 /* Whether the QIO channel requires further TLS handshake? */
 bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
The multifd recv side has been getting a TLS error of
GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
side closes the sockets without ending the TLS session. This has been
masked by the code not checking the migration error after loadvm.

Start ending the TLS session at multifd_send_shutdown() so the recv
side always sees a clean termination (EOF) and we can start to
differentiate that from an actual premature termination that might
possibly happen in the middle of the migration.

There's nothing to be done if a previous migration error has already
broken the connection, so add a comment explaining it and ignore any
errors coming from gnutls_bye().

This doesn't break compat with older recv-side QEMUs because EOF has
always caused the recv thread to exit cleanly.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 34 +++++++++++++++++++++++++++++++++-
 migration/tls.c     |  5 +++++
 migration/tls.h     |  2 +-
 3 files changed, 39 insertions(+), 2 deletions(-)
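
The recv-side distinction described in the commit message (clean EOF versus an actual error) can be sketched in isolation as follows. This is only an illustration under stated assumptions: `Error`, `RecvStatus` and `classify_read_result()` are hypothetical names, and the convention that the read helper returns 1 for a complete packet is assumed here; the patch itself only documents 0 as EOF and -1 as error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct {
    const char *msg;
} Error;

/* Hypothetical classification of one read attempt in the recv thread. */
typedef enum {
    RECV_GOT_PACKET, /* a full packet was read (assumed return value: 1) */
    RECV_CLEAN_EOF,  /* peer ended the TLS session and closed the channel */
    RECV_ERROR,      /* read failed, e.g. premature termination mid-stream */
} RecvStatus;

/*
 * Split the former "ret == 0 || ret == -1" check into two cases so
 * that a clean end of stream is no longer lumped together with errors.
 */
static RecvStatus classify_read_result(int ret, const Error *err)
{
    if (ret == 0) {
        /* EOF: a clean termination must not carry an error. */
        assert(err == NULL);
        return RECV_CLEAN_EOF;
    }

    if (ret == -1) {
        return RECV_ERROR;
    }

    return RECV_GOT_PACKET;
}
```

In both the EOF and the error case the recv thread leaves its loop, which is why older recv-side QEMUs keep working; the point of separating them is that an error can now be treated as a real failure rather than as the normal end of migration.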