[v3,0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup

Message ID	20191112113012.71136-1-slp@redhat.com (mailing list archive)
Headers
  From: Sergio Lopez <slp@redhat.com>
  To: qemu-devel@nongnu.org
  Cc: Kevin Wolf <kwolf@redhat.com>, Sergio Lopez <slp@redhat.com>, Markus Armbruster <armbru@redhat.com>, qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
  Subject: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
  Date: Tue, 12 Nov 2019 12:30:04 +0100
  Message-Id: <20191112113012.71136-1-slp@redhat.com>

Sergio Lopez Nov. 12, 2019, 11:30 a.m. UTC

do_drive_backup() acquires the AioContext lock of the corresponding
BlockDriverState. This is not a problem when it's called from
qmp_drive_backup(), but drive_backup_prepare() also acquires the lock
before calling it. The same thing happens with do_blockdev_backup()
and blockdev_backup_prepare().
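
Why the nested acquisition matters is not spelled out above. One way to picture it, assuming the lock is recursive and that some code on the path releases it exactly once so that another thread can make progress, is the sketch below. It uses Python's threading.RLock purely as a stand-in; none of the names are QEMU functions.

```python
import threading

# Stand-in for a recursive lock. This is an illustration, not QEMU code.
lock = threading.RLock()


def other_thread_can_acquire(lk):
    """Return True if a different thread can take the lock right now."""
    result = []

    def worker():
        got = lk.acquire(blocking=False)
        if got:
            lk.release()
        result.append(got)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result[0]


lock.acquire()  # outer caller takes the lock (the "prepare" role)
lock.acquire()  # callee takes it again, nesting depth is now 2
lock.release()  # a single release, meant to let another thread run
still_blocked = not other_thread_can_acquire(lock)

lock.release()  # only the second release actually frees the lock
free_again = other_thread_can_acquire(lock)

print(still_blocked, free_again)
```

With a single acquisition, the one release would have been enough for the other thread to proceed; with two, it stays locked out until the outer holder also lets go.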

This patch series merges do_drive_backup() with drive_backup_prepare()
and do_blockdev_backup() with blockdev_backup_prepare(), and ensures
they're only getting called from a transaction context. This way,
there's a single code path for both transaction requests and qmp
commands, as suggested by Kevin Wolf.

We also take this opportunity to ensure we're honoring the context
acquisition semantics required by bdrv_try_set_aio_context, as
suggested by Max Reitz.

---
Changelog:

v3:
 - Rework the whole patch series to fix the issue by consolidating all
   operations in the transaction model. (thanks Kevin Wolf)

v2:
 - Honor bdrv_try_set_aio_context() context acquisition requirements
   (thanks Max Reitz).
 - Release the context at drive_backup_prepare() instead of avoiding
   re-acquiring it at do_drive_backup(). (thanks Max Reitz)
 - Convert a single patch into a two-patch series.
---

Sergio Lopez (8):
  blockdev: merge drive_backup_prepare with do_drive_backup
  blockdev: fix coding style issues in drive_backup_prepare
  blockdev: place drive_backup_prepare with the other related
    transaction functions
  blockdev: change qmp_drive_backup to make use of transactions
  blockdev: merge blockdev_backup_prepare with do_blockdev_backup
  blockdev: place blockdev_backup_prepare with the other related
    transaction helpers
  blockdev: change qmp_blockdev_backup to make use of transactions
  blockdev: honor bdrv_try_set_aio_context() context requirements

 blockdev.c | 349 ++++++++++++++++++++++++++---------------------------
 1 file changed, 171 insertions(+), 178 deletions(-)

no-reply@patchew.org Nov. 12, 2019, 10:49 p.m. UTC | #1

Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TEST    iotest-qcow2: 268
Failures: 141
Failed 1 of 108 iotests
make: *** [check-tests/check-block.sh] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-6a9_8q0n/src/docker-src.2019-11-12-17.38.46.26027:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6a9_8q0n/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    10m57.839s
user    0m8.062s


The full log is available at
http://patchew.org/logs/20191112113012.71136-1-slp@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Sergio Lopez Nov. 13, 2019, 9:14 a.m. UTC | #2

no-reply@patchew.org writes:

>   TEST    iotest-qcow2: 268
> Failures: 141

Hm... 141 didn't fail on my test machine. I'm going to have a look.

Sergio.

Sergio Lopez Nov. 13, 2019, 1:24 p.m. UTC | #3

Sergio Lopez <slp@redhat.com> writes:

> Hm... 141 didn't fail on my test machine. I'm going to have a look.

So here's the output:

--- /root/qemu/tests/qemu-iotests/141.out	2019-11-12 04:43:27.651557587 -0500
+++ /root/qemu/build/tests/qemu-iotests/141.out.bad	2019-11-13 08:12:06.575967337 -0500
@@ -10,6 +10,8 @@
 Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}

Those extra lines, the "paused" and "running", are a result of the job
being done in a transaction, within a drained section.
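
The event order in the diff can be reproduced with a toy model, assuming the job is created and started while its node is still drained and that a job started on a drained node parks at its first pause point. ToyNode and ToyJob are hypothetical names for illustration only and do not mirror QEMU's job code.

```python
class ToyNode:
    """Hypothetical block node that tracks a drain depth and its jobs."""

    def __init__(self):
        self.drain_depth = 0
        self.jobs = []

    def drained_begin(self):
        self.drain_depth += 1
        for job in self.jobs:
            if job.events[-1] == "running":
                job.events.append("paused")

    def drained_end(self):
        self.drain_depth -= 1
        if self.drain_depth == 0:
            for job in self.jobs:
                if job.events[-1] == "paused":
                    job.events.append("running")


class ToyJob:
    """Hypothetical job that records its status changes."""

    def __init__(self, node):
        self.node = node
        self.events = ["created"]
        node.jobs.append(self)

    def start(self):
        self.events.append("running")
        if self.node.drain_depth > 0:
            # Started on a drained node: park until the drain ends.
            self.events.append("paused")


def direct_command():
    """Job created and started with no drained section around it."""
    node = ToyNode()
    job = ToyJob(node)
    job.start()
    return job.events


def via_transaction():
    """Job created and started inside a drained section."""
    node = ToyNode()
    node.drained_begin()
    job = ToyJob(node)
    job.start()
    node.drained_end()
    return job.events


print(direct_command())
print(via_transaction())
```

Under these assumptions the second call yields the same extra "paused" and "running" entries that show up in 141.out.bad.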

We can update 141.out, but now I'm wondering, was it safe creating the
job at do_drive_backup() outside of a drained section, as
qmp_drive_backup was doing?

Do you think there may be any potential drawbacks as a result of always
doing it now inside a drained section?

Thanks,
Sergio.

Max Reitz Nov. 19, 2019, 9:36 a.m. UTC | #4

On 13.11.19 14:24, Sergio Lopez wrote:
> We can update 141.out, but now I'm wondering, was it safe creating the
> job at do_drive_backup() outside of a drained section, as
> qmp_drive_backup was doing?

I think it is.  Someone needs to drain the source node before attaching
the job filter (which intercepts writes), and bdrv_backup_top_append()
does precisely this.

If the source node is in an I/O thread, you could argue that the drain
starts later than when the user has invoked the backup command, and so
some writes might slip through.  That’s correct.  But at the same time,
it’s impossible to drain it the instant the command is received.  So
some writes might always slip through (and the drain will not stop them
either, it will just let them happen).

Therefore, I think it’s fine the way it is.

> Do you think there may be any potential drawbacks as a result of always
> doing it now inside a drained section?

Well, one drawback is clearly visible.  The job goes to paused for no
reason.

Max

Sergio Lopez Nov. 19, 2019, 10:54 a.m. UTC | #5

Max Reitz <mreitz@redhat.com> writes:

> Well, one drawback is clearly visible.  The job goes to paused for no
> reason.

This is something that already happens when requesting the drive-backup
through a transaction:

{"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}]}}

I don't think it makes sense to have two different behaviors for the
same action. So we either accept the additional pause+resume iteration
for qmp_drive_backup, or we remove the drained section from the
transaction based one.
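
For reference, the two request shapes being compared can be built side by side. This is a minimal sketch: the helper names are made up for illustration, and the argument values are the ones from the request quoted above.

```python
import json


def drive_backup_command(args):
    """Plain QMP drive-backup command (helper name is illustrative)."""
    return {"execute": "drive-backup", "arguments": args}


def as_single_action_transaction(args):
    """The same request wrapped as a one-action transaction."""
    return {
        "execute": "transaction",
        "arguments": {"actions": [{"type": "drive-backup", "data": args}]},
    }


args = {"device": "drv0", "target": "o.qcow2", "sync": "full", "format": "qcow2"}

direct = json.dumps(drive_backup_command(args), separators=(",", ":"))
wrapped = json.dumps(as_single_action_transaction(args), separators=(",", ":"))

print(direct)
print(wrapped)
```

Both carry identical arguments; only the envelope differs, which is why differing event sequences for the two are surprising from a client's point of view.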

What do you think?

Cheers,
Sergio.

Kevin Wolf Nov. 19, 2019, 11:18 a.m. UTC | #6

On 19.11.2019 at 11:54, Sergio Lopez wrote:
> This is something that already happens when requesting the drive-backup
> through a transaction:
> 
> {"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}]}}
> 
> I don't think it makes sense to have two different behaviors for the
> same action. So we either accept the additional pause+resume iteration
> for qmp_drive_backup, or we remove the drained section from the
> transaction based one.
> 
> What do you think?

Draining all involved nodes is necessary for transactions, because you
want a consistent backup across all involved disks. That is, you want it
to be a snapshot at the same point in time for all of them - no requests
may happen between starting backup on the first and the second disk.

For a single device operation, this requirement doesn't exist, because
there is nothing else that must happen at the same point in time.
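
The multi-disk requirement can be illustrated with a toy simulation. It assumes each disk's contents can be summarized by a version counter and that the guest issues one write to every disk while the backups are being set up; the function and disk names are hypothetical.

```python
def start_backups(disks, guest_writes, drained):
    """Record each disk's version at the moment its backup starts.

    disks: dict of disk name -> version counter (toy stand-in for contents)
    guest_writes: disk names the guest writes to during backup setup
    drained: if True, guest writes are held back until all backups started
    """
    disks = dict(disks)  # work on a copy so calls do not affect each other
    pending = list(guest_writes)
    point_in_time = {}
    for name in sorted(disks):
        point_in_time[name] = disks[name]
        if not drained:
            # Without a drain, writes can land between two backup starts.
            for target in pending:
                disks[target] += 1
            pending = []
    for target in pending:
        # Held-back writes complete once the drained section has ended.
        disks[target] += 1
    return point_in_time


disks = {"disk0": 0, "disk1": 0}
writes = ["disk0", "disk1"]

undrained = start_backups(disks, writes, drained=False)
drained = start_backups(disks, writes, drained=True)

print(undrained)
print(drained)
```

Without the drain, disk0 is captured before the guest's writes and disk1 after them, so the two backups describe different points in time. With a single disk there is no second capture for a write to slip in front of.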

Kevin

Sergio Lopez Nov. 19, 2019, 11:35 a.m. UTC | #7

Kevin Wolf <kwolf@redhat.com> writes:

> Draining all involved nodes is necessary for transactions, because you
> want a consistent backup across all involved disks. That is, you want it
> to be a snapshot at the same point in time for all of them - no requests
> may happen between starting backup on the first and the second disk.
>
> For a single device operation, this requirement doesn't exist, because
> there is nothing else that must happen at the same point in time.

This poses a problem with the unification strategy you suggested for qmp
commands and transactions. I guess that, if we really want to preserve
the original behavior, we can extend DriveBackup to add a flag to
indicate whether the transaction should create a drained section or not.

Does this sound reasonable to you?

Thanks,
Sergio.

Kevin Wolf Nov. 19, 2019, 12:13 p.m. UTC | #8

On 19.11.2019 at 12:35, Sergio Lopez wrote:
> This poses a problem with the unification strategy you suggested for qmp
> commands and transactions. I guess that, if we really want to preserve
> the original behavior, we can extend DriveBackup to add a flag to
> indicate whether the transaction should create a drained section or not.
> 
> Does this sound reasonable to you?

I think we can accept an unnecessary drain for the single-device case.
It's only minimally worse than not draining early (because, as Max said,
we'll drain the node anyway later).

I'm not sure what the code looks like, but does the job go to paused
even when it's already created inside the drained section? (As opposed
to first creating the job and then draining.) I assume that this is what
you're already doing, just double-checking.

If this is how things work, I'd just adjust the test output and explain
the change in the commit message.

Kevin

Sergio Lopez Nov. 19, 2019, 12:31 p.m. UTC | #9

Kevin Wolf <kwolf@redhat.com> writes:

> Am 19.11.2019 um 12:35 hat Sergio Lopez geschrieben:
>> 
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 19.11.2019 um 11:54 hat Sergio Lopez geschrieben:
>> >> 
>> >> Max Reitz <mreitz@redhat.com> writes:
>> >> 
>> >> > On 13.11.19 14:24, Sergio Lopez wrote:
>> >> >> 
>> >> >> Sergio Lopez <slp@redhat.com> writes:
>> >> >> 
>> >> >>> no-reply@patchew.org writes:
>> >> >>>
>> >> >>>> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> Hi,
>> >> >>>>
>> >> >>>> This series failed the docker-quick@centos7 build test. Please find the testing commands and
>> >> >>>> their output below. If you have Docker installed, you can probably reproduce it
>> >> >>>> locally.
>> >> >>>>
>> >> >>>> === TEST SCRIPT BEGIN ===
>> >> >>>> #!/bin/bash
>> >> >>>> make docker-image-centos7 V=1 NETWORK=1
>> >> >>>> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
>> >> >>>> === TEST SCRIPT END ===
>> >> >>>>
>> >> >>>>   TEST    iotest-qcow2: 268
>> >> >>>> Failures: 141
>> >> >>>
>> >> >>> Hm... 141 didn't fail in my test machine. I'm going to have a look.
>> >> >> 
>> >> >> So here's the output:
>> >> >> 
>> >> >> --- /root/qemu/tests/qemu-iotests/141.out	2019-11-12 04:43:27.651557587 -0500
>> >> >> +++ /root/qemu/build/tests/qemu-iotests/141.out.bad	2019-11-13 08:12:06.575967337 -0500
>> >> >> @@ -10,6 +10,8 @@
>> >> >>  Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
>> >> >>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
>> >> >>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>> >> >> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
>> >> >> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>> >> >>  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
>> >> >>  {"return": {}}
>> >> >>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
>> >> >> 
>> >> >> Those extra lines, the "paused" and "running", are a result of the job
>> >> >> being done in a transaction, within a drained section.
>> >> >> 
>> >> >> We can update 141.out, but now I'm wondering, was it safe creating the
>> >> >> job at do_drive_backup() outside of a drained section, as
>> >> >> qmp_drive_backup was doing?
>> >> >
>> >> > I think it is.  Someone needs to drain the source node before attaching
>> >> > the job filter (which intercepts writes), and bdrv_backup_top_append()
>> >> > does precisely this.
>> >> >
>> >> > If the source node is in an I/O thread, you could argue that the drain
>> >> > starts later than when the user has invoked the backup command, and so
>> >> > some writes might slip through.  That’s correct.  But at the same time,
>> >> > it’s impossible to drain it the instant the command is received.  So
>> >> > some writes might always slip through (and the drain will not stop them
>> >> > either, it will just let them happen).
>> >> >
>> >> > Therefore, I think it’s fine the way it is.
>> >> >
>> >> >> Do you think there may be any potential drawbacks as a result of always
>> >> >> doing it now inside a drained section?
>> >> >
>> >> > Well, one drawback is clearly visible.  The job goes to paused for no
>> >> > reason.
>> >> 
>> >> This is something that already happens when requesting the drive-backup
>> >> through a transaction:
>> >> 
>> >> {"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}]}}
>> >> 
>> >> I don't think it makes sense to have two different behaviors for the
>> >> same action. So we either accept the additional pause+resume iteration
>> >> for qmp_drive_backup, or we remove the drained section from the
>> >> transaction based one.
>> >> 
>> >> What do you think?
>> >
>> > Draining all involved nodes is necessary for transactions, because you
>> > want a consistent backup across all involved disks. That is, you want it
>> > to be a snapshot at the same point in time for all of them - no requests
>> > may happen between starting backup on the first and the second disk.
>> >
>> > For a single device operation, this requirement doesn't exist, because
>> > there is nothing else that must happen at the same point in time.
>> 
>> This poses a problem with the unification strategy you suggested for qmp
>> commands and transactions. I guess that, if we really want to preserve
>> the original behavior, we can extend DriveBackup to add a flag to
>> indicate whether the transaction should create a drained section or not.
>> 
>> Does this sound reasonable to you?
>
> I think we can accept an unnecessary drain for the single-device case.
> It's only minimally worse than not draining early (because, as Max said,
> we'll drain the node anyway later).
>
> I'm not sure what the code looks like, but does the job go to paused
> even when it's already created inside the drained section? (As opposed
> to first creating the job and then draining.) I assume that this is what
> you're already doing, just double-checking.

Yes, that's the case. drive_backup_prepare() calls
bdrv_drained_begin() first, and then do_backup_common(), which creates
the job.
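
For illustration only, the ordering described above can be sketched as a
toy model. This is not QEMU code: struct model, the model_*() helpers and
the single pause counter are all hypothetical names and simplifications
made up for this example. It assumes that a job created inside a drained
section inherits a pending pause request and only resumes once the
section ends, which would match the extra "paused"/"running" events seen
in the 141 output.

```c
#include <assert.h>
#include <string.h>

/* Toy model, not QEMU code: every name below is hypothetical. */
struct model {
    int drained;        /* nesting depth of the drained section */
    int pause_count;    /* pause requests pending on the job */
    int paused;         /* 1 while the job is paused */
    char log[128];      /* space-separated status changes */
};

/* Append one status change to the log. */
static void emit(struct model *m, const char *status)
{
    if (m->log[0] != '\0') {
        strncat(m->log, " ", sizeof(m->log) - strlen(m->log) - 1);
    }
    strncat(m->log, status, sizeof(m->log) - strlen(m->log) - 1);
}

static void model_drained_begin(struct model *m)
{
    m->drained++;
}

/* Assumption: a job created on a drained node starts with one
 * pending pause request per active drain. */
static void model_job_create(struct model *m)
{
    m->pause_count = m->drained;
    emit(m, "created");
}

/* The job reports "running", then pauses if a request is pending. */
static void model_job_start(struct model *m)
{
    emit(m, "running");
    if (m->pause_count > 0) {
        m->paused = 1;
        emit(m, "paused");
    }
}

/* Ending the drain drops one pause request; the job resumes at zero. */
static void model_drained_end(struct model *m)
{
    assert(m->drained > 0);
    m->drained--;
    if (m->pause_count > 0 && --m->pause_count == 0 && m->paused) {
        m->paused = 0;
        emit(m, "running");
    }
}

/* Run one scenario from a clean state and return the status log.
 * with_drain != 0 models the transaction-like path. */
const char *model_run(struct model *m, int with_drain)
{
    memset(m, 0, sizeof(*m));
    if (with_drain) {
        model_drained_begin(m);
    }
    model_job_create(m);
    model_job_start(m);
    if (with_drain) {
        model_drained_end(m);
    }
    return m->log;
}
```

Calling model_run(&m, 0) models the old qmp_drive_backup path, and
model_run(&m, 1) models the path where the job is created and started
inside the drained section. The real block layer has considerably more
machinery involved, so this should only be read as a picture of the event
ordering, not of the implementation.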

> If this is how things work, I'd just adjust the test output and explain
> the change in the commit message.

OK, I'll prepare a v4 with a rework of the patchset and an update to the
job.

Thanks,
Sergio.
