[RFC,v2,0/8] crypto,io,migration: Add support to gnutls_bye()

Message ID 20250207142758.6936-1-farosas@suse.de (mailing list archive)

Message

Fabiano Rosas Feb. 7, 2025, 2:27 p.m. UTC
v2:

Added the premature_ok logic;
Added compat property for QEMU <9.1;
Refactored the existing handshake code;

CI run:
https://gitlab.com/farosas/qemu/-/pipelines/1660800456

v1:
https://lore.kernel.org/r/20250206175824.22664-1-farosas@suse.de

Hi,

We've been discussing a way to stop multifd recv threads from getting
an error at the end of migration when the source threads close the
iochannel without ending the TLS session.

The original issue was introduced by commit 1d457daf86
("migration/multifd: Further remove the SYNC on complete") which
altered the synchronization of the source and destination in a manner
that causes the destination to already be waiting at recv() when the
source closes the connection.

One approach would be to issue gnutls_bye() at the source after all
the data has been sent. The destination would then gracefully exit
when it gets EOF.

Aside from stopping the recv thread from seeing an error, this also
creates a contract that all connections should be closed only after
the TLS session is ended. This helps to avoid masking a legitimate
issue where the connection is closed prematurely.
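
As a rough illustration of that contract (a minimal sketch, not the code
in this series), the receive side can be modelled as a function that
classifies each read. The enum, the function name, and the reading of
premature_ok as "tolerate EOF without a TLS termination" are assumptions
made for this example. In GnuTLS terms, a clean end is gnutls_record_recv()
returning 0 after the peer called gnutls_bye(), while a premature one is
reported as GNUTLS_E_PREMATURE_TERMINATION.

```c
#include <stdbool.h>

/* Hypothetical outcomes of one read on a TLS channel (not QEMU's API). */
enum read_result {
    READ_DATA,          /* payload bytes were received */
    READ_EOF_CLEAN,     /* peer ended the TLS session, then closed */
    READ_EOF_PREMATURE, /* peer closed without ending the TLS session */
};

/*
 * Returns 0 to keep reading, 1 when the thread may exit gracefully,
 * and -1 when the EOF must be treated as an error.
 *
 * premature_ok is assumed to be set only when the peer is known not to
 * terminate the TLS session, e.g. via a compat property.
 */
int recv_step(enum read_result r, bool premature_ok)
{
    switch (r) {
    case READ_DATA:
        return 0;
    case READ_EOF_CLEAN:
        return 1;
    case READ_EOF_PREMATURE:
        /* Without the relaxed flag, a bare close is a real error. */
        return premature_ok ? 1 : -1;
    }
    return -1;
}
```

With premature_ok unset, a connection that drops mid-stream is reported
instead of being mistaken for a normal end of migration.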

Fabiano Rosas (8):
  crypto: Allow gracefully ending the TLS session
  io: tls: Add qio_channel_tls_bye
  migration/multifd: Terminate the TLS connection
  migration: Check migration error after loadvm
  crypto: Remove qcrypto_tls_session_get_handshake_status
  io: Plumb read flags into qio_channel_read_all_eof
  io: Add a read flag for relaxed EOF
  migration/multifd: Add a compat property for TLS termination

 crypto/tlssession.c                 | 105 +++++++++++++++++-----------
 hw/remote/mpqemu-link.c             |   2 +-
 include/crypto/tlssession.h         |  46 ++++++------
 include/io/channel-tls.h            |  12 ++++
 include/io/channel.h                |   7 ++
 io/channel-tls.c                    |  92 +++++++++++++++++++++++-
 io/channel.c                        |  13 ++--
 io/trace-events                     |   5 ++
 migration/migration.h               |  33 +++++++++
 migration/multifd.c                 |  42 ++++++++++-
 migration/multifd.h                 |   2 +
 migration/options.c                 |   2 +
 migration/savevm.c                  |   6 +-
 migration/tls.c                     |   5 ++
 migration/tls.h                     |   2 +-
 tests/unit/test-crypto-tlssession.c |  12 ++--
 tools/i386/qemu-vmsr-helper.c       |   3 +-
 util/vhost-user-server.c            |   2 +-
 18 files changed, 308 insertions(+), 83 deletions(-)

Comments

Maciej S. Szmigiero Feb. 7, 2025, 7:44 p.m. UTC | #1
On 7.02.2025 15:27, Fabiano Rosas wrote:
> v2:
> 
> Added the premature_ok logic;
> Added compat property for QEMU <9.1;
> Refactored the existing handshake code;
> 
> CI run:
> https://gitlab.com/farosas/qemu/-/pipelines/1660800456
> 
> v1:
> https://lore.kernel.org/r/20250206175824.22664-1-farosas@suse.de
> 
> Hi,
> 
> We've been discussing a way to stop multifd recv threads from getting
> an error at the end of migration when the source threads close the
> iochannel without ending the TLS session.
> 
> The original issue was introduced by commit 1d457daf86
> ("migration/multifd: Further remove the SYNC on complete") which
> altered the synchronization of the source and destination in a manner
> that causes the destination to already be waiting at recv() when the
> source closes the connection.
> 
> One approach would be to issue gnutls_bye() at the source after all
> the data has been sent. The destination would then gracefully exit
> when it gets EOF.
> 
> Aside from stopping the recv thread from seeing an error, this also
> creates a contract that all connections should be closed only after
> the TLS session is ended. This helps to avoid masking a legitimate
> issue where the connection is closed prematurely.
> 

I've rebased my patch set on top of this version and can confirm
it works too (with respect to VFIO migration and QEMU tests).

The updated series is available in its usual place.

Thanks,
Maciej