diff mbox series

[v2,13/17] migration/multifd: Add migration_has_device_state_support()

Message ID 8407eb455dfc1dea3cabf065f90833fab337eb98.1724701542.git.maciej.szmigiero@oracle.com (mailing list archive)
State New, archived
Headers show
Series Multifd | expand

Commit Message

Maciej S. Szmigiero Aug. 27, 2024, 5:54 p.m. UTC
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Since device state transfer via multifd channels requires multifd
channels with packets and is currently not compatible with multifd
compression add an appropriate query function so device can learn
whether it can actually make use of it.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/migration/misc.h         | 1 +
 migration/multifd-device-state.c | 7 +++++++
 2 files changed, 8 insertions(+)

Comments

Fabiano Rosas Aug. 30, 2024, 6:55 p.m. UTC | #1
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Since device state transfer via multifd channels requires multifd
> channels with packets and is currently not compatible with multifd
> compression add an appropriate query function so device can learn
> whether it can actually make use of it.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

Out of curiosity, what do you see as a blocker for migrating to a file?

We would just need to figure out a mapping from file offset some unit of
data to be able to write in parallel like with ram (of which the page
offset is mapped to the file offset).

> ---
>  include/migration/misc.h         | 1 +
>  migration/multifd-device-state.c | 7 +++++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 7266b1b77d1f..189de6d02ad6 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -114,5 +114,6 @@ void dirty_bitmap_mig_init(void);
>  /* migration/multifd-device-state.c */
>  bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>                                  char *data, size_t len);
> +bool migration_has_device_state_support(void);
>  
>  #endif
> diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
> index c9b44f0b5ab9..7b34fe736c7f 100644
> --- a/migration/multifd-device-state.c
> +++ b/migration/multifd-device-state.c
> @@ -11,6 +11,7 @@
>  #include "qemu/lockable.h"
>  #include "migration/misc.h"
>  #include "multifd.h"
> +#include "options.h"
>  
>  static QemuMutex queue_job_mutex;
>  
> @@ -97,3 +98,9 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>  
>      return true;
>  }
> +
> +bool migration_has_device_state_support(void)
> +{
> +    return migrate_multifd() && !migrate_mapped_ram() &&
> +        migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE;
> +}
Maciej S. Szmigiero Sept. 2, 2024, 8:11 p.m. UTC | #2
On 30.08.2024 20:55, Fabiano Rosas wrote:
> "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
> 
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> Since device state transfer via multifd channels requires multifd
>> channels with packets and is currently not compatible with multifd
>> compression add an appropriate query function so device can learn
>> whether it can actually make use of it.
>>
>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> 
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> 
> Out of curiosity, what do you see as a blocker for migrating to a file?
> 
> We would just need to figure out a mapping from file offset some unit of
> data to be able to write in parallel like with ram (of which the page
> offset is mapped to the file offset).

I'm not sure whether there's a point in that since VFIO devices
just provide a raw device state stream - there's no way to know
that some buffer is no longer needed because it consisted of
dirty data that was completely overwritten by a later buffer.

Also, the device type that the code was developed against - a (smart)
NIC - has so large device state because (more or less) it keeps a lot
of data about network connections passing / made through it.

It doesn't really make sense to make snapshot of such device for later
reload since these connections will be long dropped by their remote
peers by this point.

Such snapshotting might make more sense with GPU VFIO devices though.

If such file migration support is desired at some later point then for
sure the whole code would need to be carefully re-checked for implicit
assumptions.

Thanks,
Maciej
Fabiano Rosas Sept. 3, 2024, 3:09 p.m. UTC | #3
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

> On 30.08.2024 20:55, Fabiano Rosas wrote:
>> "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
>> 
>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>
>>> Since device state transfer via multifd channels requires multifd
>>> channels with packets and is currently not compatible with multifd
>>> compression add an appropriate query function so device can learn
>>> whether it can actually make use of it.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> 
>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>> 
>> Out of curiosity, what do you see as a blocker for migrating to a file?
>> 
>> We would just need to figure out a mapping from file offset some unit of
>> data to be able to write in parallel like with ram (of which the page
>> offset is mapped to the file offset).
>
> I'm not sure whether there's a point in that since VFIO devices
> just provide a raw device state stream - there's no way to know
> that some buffer is no longer needed because it consisted of
> dirty data that was completely overwritten by a later buffer.
>
> Also, the device type that the code was developed against - a (smart)
> NIC - has so large device state because (more or less) it keeps a lot
> of data about network connections passing / made through it.
>
> It doesn't really make sense to make snapshot of such device for later
> reload since these connections will be long dropped by their remote
> peers by this point.
>
> Such snapshotting might make more sense with GPU VFIO devices though.
>
> If such file migration support is desired at some later point then for
> sure the whole code would need to be carefully re-checked for implicit
> assumptions.

Thanks, let's keep those arguments in mind, maybe we put them in a doc
somewhere so we remember this in the future.

>
> Thanks,
> Maciej
diff mbox series

Patch

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 7266b1b77d1f..189de6d02ad6 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -114,5 +114,6 @@  void dirty_bitmap_mig_init(void);
 /* migration/multifd-device-state.c */
 bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
                                 char *data, size_t len);
+bool migration_has_device_state_support(void);
 
 #endif
diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
index c9b44f0b5ab9..7b34fe736c7f 100644
--- a/migration/multifd-device-state.c
+++ b/migration/multifd-device-state.c
@@ -11,6 +11,7 @@ 
 #include "qemu/lockable.h"
 #include "migration/misc.h"
 #include "multifd.h"
+#include "options.h"
 
 static QemuMutex queue_job_mutex;
 
@@ -97,3 +98,9 @@  bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
 
     return true;
 }
+
+bool migration_has_device_state_support(void)
+{
+    return migrate_multifd() && !migrate_mapped_ram() &&
+        migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE;
+}