[v6,16/19] softmmu/vl: defer backend init

Message ID 55fa22ea0e82b204ca3c5ee2fc4b9b3d2c1669f6.1645079934.git.jag.raman@oracle.com (mailing list archive)
State New, archived
Series vfio-user server in QEMU

Commit Message

Jag Raman Feb. 17, 2022, 7:49 a.m. UTC
Allow deferred initialization of backends. TYPE_REMOTE_MACHINE is
agnostic to QEMU's RUN_STATE. It's state is driven by the QEMU client
via the vfio-user protocol. Whereas, the backends presently defer
initialization if QEMU is in RUN_STATE_INMIGRATE. Since the remote
machine can't use RUN_STATE*, this commit allows it to ask for deferred
initialization of backend devices. It is primarily targeted towards block
devices in this commit, but it need not be limited to that.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/sysemu/sysemu.h    |  4 ++++
 block/block-backend.c      |  3 ++-
 blockdev.c                 |  2 +-
 softmmu/vl.c               | 17 +++++++++++++++++
 stubs/defer-backend-init.c |  7 +++++++
 MAINTAINERS                |  1 +
 stubs/meson.build          |  1 +
 7 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 stubs/defer-backend-init.c

Comments

Stefan Hajnoczi March 7, 2022, 10:48 a.m. UTC | #1
On Thu, Feb 17, 2022 at 02:49:03AM -0500, Jagannathan Raman wrote:
> Allow deferred initialization of backends. TYPE_REMOTE_MACHINE is
> agnostic to QEMU's RUN_STATE. It's state is driven by the QEMU client

s/It's/Its/

> via the vfio-user protocol. Whereas, the backends presently defer
> initialization if QEMU is in RUN_STATE_INMIGRATE. Since the remote
> machine can't use RUN_STATE*, this commit allows it to ask for deferred
> initialization of backend device. It is primarily targeted towards block
> devices in this commit, but it needed not be limited to that.

What is the purpose of this commit? I don't understand the description.

> 
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> ---
>  include/sysemu/sysemu.h    |  4 ++++
>  block/block-backend.c      |  3 ++-
>  blockdev.c                 |  2 +-
>  softmmu/vl.c               | 17 +++++++++++++++++
>  stubs/defer-backend-init.c |  7 +++++++
>  MAINTAINERS                |  1 +
>  stubs/meson.build          |  1 +
>  7 files changed, 33 insertions(+), 2 deletions(-)
>  create mode 100644 stubs/defer-backend-init.c
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index b9421e03ff..3179eb1857 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -119,4 +119,8 @@ extern QemuOptsList qemu_net_opts;
>  extern QemuOptsList qemu_global_opts;
>  extern QemuOptsList qemu_semihosting_config_opts;
>  
> +bool deferred_backend_init(void);
> +void set_deferred_backend_init(void);
> +void clear_deferred_backend_init(void);
> +
>  #endif
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 4ff6b4d785..e04f9b6469 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -20,6 +20,7 @@
>  #include "sysemu/blockdev.h"
>  #include "sysemu/runstate.h"
>  #include "sysemu/replay.h"
> +#include "sysemu/sysemu.h"
>  #include "qapi/error.h"
>  #include "qapi/qapi-events-block.h"
>  #include "qemu/id.h"
> @@ -935,7 +936,7 @@ int blk_attach_dev(BlockBackend *blk, DeviceState *dev)
>      /* While migration is still incoming, we don't need to apply the
>       * permissions of guest device BlockBackends. We might still have a block
>       * job or NBD server writing to the image for storage migration. */
> -    if (runstate_check(RUN_STATE_INMIGRATE)) {
> +    if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
>          blk->disable_perm = true;
>      }

Why is this necessary for vfio-user? Disk images shouldn't be in use by
another process so we don't need to bypass permissions temporarily.

>  
> diff --git a/blockdev.c b/blockdev.c
> index 42e098b458..d495070679 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -569,7 +569,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
>          qdict_set_default_str(bs_opts, BDRV_OPT_AUTO_READ_ONLY, "on");
>          assert((bdrv_flags & BDRV_O_CACHE_MASK) == 0);
>  
> -        if (runstate_check(RUN_STATE_INMIGRATE)) {
> +        if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
>              bdrv_flags |= BDRV_O_INACTIVE;

Same here.

Jag Raman March 7, 2022, 3:31 p.m. UTC | #2
> On Mar 7, 2022, at 5:48 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Thu, Feb 17, 2022 at 02:49:03AM -0500, Jagannathan Raman wrote:
>> Allow deferred initialization of backends. TYPE_REMOTE_MACHINE is
>> agnostic to QEMU's RUN_STATE. It's state is driven by the QEMU client
> 
> s/It's/Its/
> 
>> via the vfio-user protocol. Whereas, the backends presently defer
>> initialization if QEMU is in RUN_STATE_INMIGRATE. Since the remote
>> machine can't use RUN_STATE*, this commit allows it to ask for deferred
>> initialization of backend device. It is primarily targeted towards block
>> devices in this commit, but it needed not be limited to that.
> 
> What is the purpose of this commit? I don't understand the description.

Sorry it’s not clear. This patch is needed to support vfio-user migration.

Just for background, this patch along with the next one helps to migrate
individual devices from the source to the destination. For example, in a
storage server daemon with 5 PCI controllers, we could migrate just 2 of
the 5 controllers to the destination while the remaining 3 continue to run
on the source. The destination could also be a server that is already
running; it doesn’t have to be frozen for migration.

This patch specifically affects how block drives are initialized in the
destination. In all the presently defined use cases, QEMU launches the
destination in RUN_STATE_INMIGRATE. This is essentially a frozen
state, which implicitly defers the initialization of the backends such as
block drives until after the migration is complete. Whereas in vfio-user,
the destination cannot be in RUN_STATE_INMIGRATE as it could already
be running. Therefore, we need a way to tell backend devices to defer
their initialization. This patch addresses the need to defer backend
initialization for already running QEMU instances.
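
To make the intended call pattern concrete, here is a minimal standalone
sketch of the flag and the combined check. It is a model, not QEMU code:
the three accessors use the names from this patch, but ModelRunState,
model_backend_should_defer() and model_accept_incoming_device() are
hypothetical names invented for illustration, and the real run state
handling in QEMU is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for QEMU's run state; only the two values needed here. */
typedef enum { MODEL_RUNNING, MODEL_INMIGRATE } ModelRunState;

static ModelRunState model_run_state = MODEL_RUNNING;
static bool defer_backend_init;

/* Same names and semantics as the accessors this patch adds. */
bool deferred_backend_init(void) { return defer_backend_init; }
void set_deferred_backend_init(void) { defer_backend_init = true; }
void clear_deferred_backend_init(void) { defer_backend_init = false; }

/*
 * Hypothetical helper mirroring the condition used in the patched
 * blk_attach_dev() and blockdev_init(): defer if the whole process is
 * in incoming migration OR the caller explicitly asked for deferral.
 */
bool model_backend_should_defer(void)
{
    return model_run_state == MODEL_INMIGRATE || deferred_backend_init();
}

/*
 * Hypothetical flow for an already running server that accepts one
 * incoming device: request deferral, create the backend (modelled here
 * as a single check), then drop the request again. Returns 1 if the
 * backend saw the deferral request, 0 otherwise.
 */
int model_accept_incoming_device(void)
{
    int deferred;

    set_deferred_backend_init();
    deferred = model_backend_should_defer() ? 1 : 0;
    clear_deferred_backend_init();

    /* The flag must not leak into backends created later. */
    assert(!deferred_backend_init());
    return deferred;
}
```

The point of the sketch is that the process stays in its running state
throughout; only the explicit flag changes what the backend sees.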

> 
>> 
>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>> ---
>> include/sysemu/sysemu.h    |  4 ++++
>> block/block-backend.c      |  3 ++-
>> blockdev.c                 |  2 +-
>> softmmu/vl.c               | 17 +++++++++++++++++
>> stubs/defer-backend-init.c |  7 +++++++
>> MAINTAINERS                |  1 +
>> stubs/meson.build          |  1 +
>> 7 files changed, 33 insertions(+), 2 deletions(-)
>> create mode 100644 stubs/defer-backend-init.c
>> 
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index b9421e03ff..3179eb1857 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -119,4 +119,8 @@ extern QemuOptsList qemu_net_opts;
>> extern QemuOptsList qemu_global_opts;
>> extern QemuOptsList qemu_semihosting_config_opts;
>> 
>> +bool deferred_backend_init(void);
>> +void set_deferred_backend_init(void);
>> +void clear_deferred_backend_init(void);
>> +
>> #endif
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 4ff6b4d785..e04f9b6469 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -20,6 +20,7 @@
>> #include "sysemu/blockdev.h"
>> #include "sysemu/runstate.h"
>> #include "sysemu/replay.h"
>> +#include "sysemu/sysemu.h"
>> #include "qapi/error.h"
>> #include "qapi/qapi-events-block.h"
>> #include "qemu/id.h"
>> @@ -935,7 +936,7 @@ int blk_attach_dev(BlockBackend *blk, DeviceState *dev)
>>     /* While migration is still incoming, we don't need to apply the
>>      * permissions of guest device BlockBackends. We might still have a block
>>      * job or NBD server writing to the image for storage migration. */
>> -    if (runstate_check(RUN_STATE_INMIGRATE)) {
>> +    if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
>>         blk->disable_perm = true;
>>     }
> 
> Why is this necessary for vfio-user? Disk images shouldn't be in use by
> another process so we don't need to bypass permissions temporarily.

The destination in vfio-user migration needs this - the source would
already be using the disk images.
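
As a rough illustration of what the two changed conditions do on the
destination, the sketch below models them with stand-in types.
ModelBlockBackend, MODEL_O_INACTIVE, model_open_flags() and
model_attach_dev() are hypothetical names for this example only, and the
flag value is illustrative rather than taken from QEMU.

```c
#include <stdbool.h>

/* Stand-in flag bit; the value is illustrative, not QEMU's. */
#define MODEL_O_INACTIVE 0x1

/* Stand-in for the one BlockBackend field this patch touches. */
typedef struct {
    bool disable_perm;
} ModelBlockBackend;

/*
 * Models the blockdev_init() hunk: the image is opened inactive when
 * either the process is in incoming migration or deferral was requested.
 */
int model_open_flags(int flags, bool inmigrate, bool deferred)
{
    if (inmigrate || deferred) {
        flags |= MODEL_O_INACTIVE;
    }
    return flags;
}

/*
 * Models the blk_attach_dev() hunk: permissions are not applied yet
 * under the same condition, since the source may still hold the image.
 */
void model_attach_dev(ModelBlockBackend *blk, bool inmigrate, bool deferred)
{
    if (inmigrate || deferred) {
        blk->disable_perm = true;
    }
}
```

With neither condition set, the flags pass through unchanged and
disable_perm keeps its previous value, which matches the behaviour
before this patch.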

Thank you!
--
Jag

> 
>> 
>> diff --git a/blockdev.c b/blockdev.c
>> index 42e098b458..d495070679 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -569,7 +569,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
>>         qdict_set_default_str(bs_opts, BDRV_OPT_AUTO_READ_ONLY, "on");
>>         assert((bdrv_flags & BDRV_O_CACHE_MASK) == 0);
>> 
>> -        if (runstate_check(RUN_STATE_INMIGRATE)) {
>> +        if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
>>             bdrv_flags |= BDRV_O_INACTIVE;
> 
> Same here.
Patch

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index b9421e03ff..3179eb1857 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -119,4 +119,8 @@  extern QemuOptsList qemu_net_opts;
 extern QemuOptsList qemu_global_opts;
 extern QemuOptsList qemu_semihosting_config_opts;
 
+bool deferred_backend_init(void);
+void set_deferred_backend_init(void);
+void clear_deferred_backend_init(void);
+
 #endif
diff --git a/block/block-backend.c b/block/block-backend.c
index 4ff6b4d785..e04f9b6469 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -20,6 +20,7 @@ 
 #include "sysemu/blockdev.h"
 #include "sysemu/runstate.h"
 #include "sysemu/replay.h"
+#include "sysemu/sysemu.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-block.h"
 #include "qemu/id.h"
@@ -935,7 +936,7 @@  int blk_attach_dev(BlockBackend *blk, DeviceState *dev)
     /* While migration is still incoming, we don't need to apply the
      * permissions of guest device BlockBackends. We might still have a block
      * job or NBD server writing to the image for storage migration. */
-    if (runstate_check(RUN_STATE_INMIGRATE)) {
+    if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
         blk->disable_perm = true;
     }
 
diff --git a/blockdev.c b/blockdev.c
index 42e098b458..d495070679 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -569,7 +569,7 @@  static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
         qdict_set_default_str(bs_opts, BDRV_OPT_AUTO_READ_ONLY, "on");
         assert((bdrv_flags & BDRV_O_CACHE_MASK) == 0);
 
-        if (runstate_check(RUN_STATE_INMIGRATE)) {
+        if (runstate_check(RUN_STATE_INMIGRATE) || deferred_backend_init()) {
             bdrv_flags |= BDRV_O_INACTIVE;
         }
 
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5e1b35ba48..9584ab82e3 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -496,6 +496,23 @@  static QemuOptsList qemu_action_opts = {
     },
 };
 
+bool defer_backend_init;
+
+bool deferred_backend_init(void)
+{
+    return defer_backend_init;
+}
+
+void set_deferred_backend_init(void)
+{
+    defer_backend_init = true;
+}
+
+void clear_deferred_backend_init(void)
+{
+    defer_backend_init = false;
+}
+
 const char *qemu_get_vm_name(void)
 {
     return qemu_name;
diff --git a/stubs/defer-backend-init.c b/stubs/defer-backend-init.c
new file mode 100644
index 0000000000..3a74c669a1
--- /dev/null
+++ b/stubs/defer-backend-init.c
@@ -0,0 +1,7 @@ 
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+
+bool deferred_backend_init(void)
+{
+    return false;
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index e274cb46af..1f55d04ce6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3572,6 +3572,7 @@  F: hw/remote/vfio-user-obj.c
 F: include/hw/remote/vfio-user-obj.h
 F: hw/remote/iommu.c
 F: include/hw/remote/iommu.h
+F: stubs/defer-backend-init.c
 
 EBPF:
 M: Jason Wang <jasowang@redhat.com>
diff --git a/stubs/meson.build b/stubs/meson.build
index c5ce979dc3..98770966f6 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -58,3 +58,4 @@  else
   stub_ss.add(files('qdev.c'))
 endif
 stub_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_false: files('vfio-user-obj.c'))
+stub_ss.add(files('defer-backend-init.c'))