[0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot

Message ID 20230828151842.11303-1-avihaih@nvidia.com (mailing list archive)

Avihai Horon Aug. 28, 2023, 3:18 p.m. UTC
Hello,

The recently added VFIO migration support is not compatible with some
pre-existing migration features. This was overlooked, and today QEMU
does not block these combinations. This series fixes that.

Postcopy migration:
VFIO migration is not compatible with postcopy migration. A VFIO device
on the destination can't handle page faults for pages that have not been
sent yet. Attempting such a migration crashes the VM on the
destination.
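
To illustrate the failure mode, here is a toy Python model. It is not
QEMU code, and the class and method names are hypothetical. It assumes
that a CPU access to a not-yet-received page can be resolved by fetching
the page from the source, while device DMA has no such fault path.

```python
class ToyPostcopyDestination:
    """Toy model of a postcopy destination (hypothetical, not QEMU code)."""

    def __init__(self, source_ram):
        self.source_ram = source_ram  # pages still held by the source
        self.received = {}            # pages already present on the destination

    def cpu_read(self, page):
        # A CPU access to a missing page faults; the fault is resolved by
        # fetching the page from the source before the access completes.
        if page not in self.received:
            self.received[page] = self.source_ram[page]
        return self.received[page]

    def device_dma_read(self, page):
        # Assumption of this model: device DMA cannot take that fault path,
        # so touching a missing page is fatal.
        if page not in self.received:
            raise RuntimeError(f"DMA to page {page} that was not sent yet")
        return self.received[page]
```

In this model, a DMA read of a page the CPU has already faulted in
succeeds, while a DMA read of any other page raises an error, which
stands in for the crash described above.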

Background snapshot:
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.

It works by first stopping the VM and saving the non-iterable devices'
state, then restarting the VM and saving the RAM while write-protecting
it with UFFD (userfaultfd). The resulting snapshot represents the VM
state at snapshot start.

VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in a background
snapshot, because only non-iterable device state is saved. But even if
it were saved, after the VM is started a VFIO device could dirty pages
without being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
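
The corruption can be sketched with a toy Python model. It is not QEMU
code, and all names are hypothetical. It assumes every page starts out
write-protected, that a CPU write to a protected page saves the old
contents first, and that device DMA writes bypass that mechanism.

```python
class ToySnapshot:
    """Toy model of a write-protect based snapshot (hypothetical)."""

    def __init__(self, ram):
        self.ram = ram                          # live guest RAM, one item per page
        self.saved = {}                         # pages already in the snapshot
        self.protected = set(range(len(ram)))   # pages still write-protected

    def _save_page(self, page):
        # Copy the page into the snapshot and drop its write protection.
        self.saved[page] = self.ram[page]
        self.protected.discard(page)

    def cpu_write(self, page, value):
        # A CPU write to a protected page faults first, so the old
        # contents are saved before the write proceeds.
        if page in self.protected:
            self._save_page(page)
        self.ram[page] = value

    def device_dma_write(self, page, value):
        # Assumption of this model: DMA is not caught by the write
        # protection, so nothing is saved before the page changes.
        self.ram[page] = value

    def finish(self):
        # Save whatever is still protected, then return the snapshot.
        for page in sorted(self.protected):
            self._save_page(page)
        return [self.saved[p] for p in range(len(self.ram))]


def demo():
    start = ["a", "b", "c"]
    snap = ToySnapshot(list(start))
    snap.cpu_write(0, "A")          # old value "a" is preserved
    snap.device_dma_write(1, "B")   # old value "b" is lost
    return start, snap.finish()
```

Here the page written by the CPU keeps its snapshot-start contents,
while the page written by the device ends up in the snapshot with its
later contents.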

This series blocks these combinations explicitly:
If a VFIO device is added while postcopy or background snapshot is
enabled, a migration blocker is added. If a VFIO device is present,
setting the postcopy or background snapshot capability fails with an
appropriate error message.
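
The two checks above can be sketched as a small Python model. This is
an illustration only, not the code in this series: the class, its
methods and the error text are hypothetical, and the capability strings
are assumed to match QEMU's "postcopy-ram" and "background-snapshot"
capability names.

```python
# Capabilities assumed to be incompatible with VFIO migration.
INCOMPATIBLE_CAPS = {"postcopy-ram", "background-snapshot"}


class ToyMigrationState:
    """Toy model of the two blocking directions (hypothetical)."""

    def __init__(self):
        self.caps = set()       # enabled migration capabilities
        self.vfio_devices = 0   # number of VFIO devices present
        self.blockers = []      # reasons migration is blocked

    def add_vfio_device(self, name):
        # Direction 1: device added while an incompatible capability is
        # on. The device is added, but a migration blocker is registered.
        self.vfio_devices += 1
        active = sorted(self.caps & INCOMPATIBLE_CAPS)
        if active:
            self.blockers.append(
                f"{name}: VFIO migration is not compatible with "
                + ", ".join(active)
            )

    def set_capability(self, cap):
        # Direction 2: capability set while a VFIO device is present.
        # The request fails and the capability stays off.
        if cap in INCOMPATIBLE_CAPS and self.vfio_devices > 0:
            raise ValueError(f"{cap} is not compatible with VFIO migration")
        self.caps.add(cap)

    def can_migrate(self):
        return not self.blockers
```

Enabling "postcopy-ram" and then adding a device leaves the model with
one blocker; adding a device first and then setting
"background-snapshot" raises an error and leaves the capability unset.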

Note that this series is based on the P2P series [1] sent a few weeks
ago.

Comments and suggestions will be greatly appreciated.

Thanks.

[1]
https://lore.kernel.org/qemu-devel/20230802081449.2528-1-avihaih@nvidia.com/

Avihai Horon (6):
  migration: Add migration prefix to functions in target.c
  vfio/migration: Fail adding device with enable-migration=on and
    existing blocker
  vfio/migration: Add vfio_migratable_devices_num()
  vfio/migration: Change vfio_mig_active() semantics
  vfio/migration: Block VFIO migration with postcopy migration
  vfio/migration: Block VFIO migration with background snapshot

 include/hw/vfio/vfio-common.h |   4 ++
 migration/migration.h         |   7 +-
 hw/vfio/common.c              | 120 +++++++++++++++++++++++++++++-----
 hw/vfio/migration.c           |  12 ++++
 migration/migration.c         |   6 +-
 migration/options.c           |  26 ++++++++
 migration/savevm.c            |   2 +-
 migration/target.c            |  36 ++++++++--
 8 files changed, 187 insertions(+), 26 deletions(-)