Message ID | 20220215135727.28521-4-hreitz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block: Make bdrv_refresh_limits() non-recursive | expand |
On Tue, Feb 15, 2022 at 02:57:27PM +0100, Hanna Reitz wrote: > Test the following scenario: > 1. Some block node (null-co) attached to a user (here: NBD server) that > performs I/O and keeps the node in an I/O thread > 2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay > to/from that node > > Each blockdev-add triggers bdrv_refresh_limits(), and because > blockdev-add runs in the main thread, it does not stop the I/O requests. > I/O can thus happen while the limits are refreshed, and when such a > request sees a temporarily invalid block limit (e.g. alignment is 0), > this may easily crash qemu (or the storage daemon in this case). > > The block layer needs to ensure that I/O requests to a node are paused > while that node's BlockLimits are refreshed. > > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > --- > .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++ > .../tests/graph-changes-while-io.out | 5 + > 2 files changed, 96 insertions(+) > create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io > create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out Reviewed-by: Eric Blake <eblake@redhat.com> Since we found this with the help of NBD, should I be considering this series for my NBD queue, or is there a better block-related maintainer queue that it should go through?
On 15.02.22 23:22, Eric Blake wrote: > On Tue, Feb 15, 2022 at 02:57:27PM +0100, Hanna Reitz wrote: >> Test the following scenario: >> 1. Some block node (null-co) attached to a user (here: NBD server) that >> performs I/O and keeps the node in an I/O thread >> 2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay >> to/from that node >> >> Each blockdev-add triggers bdrv_refresh_limits(), and because >> blockdev-add runs in the main thread, it does not stop the I/O requests. >> I/O can thus happen while the limits are refreshed, and when such a >> request sees a temporarily invalid block limit (e.g. alignment is 0), >> this may easily crash qemu (or the storage daemon in this case). >> >> The block layer needs to ensure that I/O requests to a node are paused >> while that node's BlockLimits are refreshed. >> >> Signed-off-by: Hanna Reitz <hreitz@redhat.com> >> --- >> .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++ >> .../tests/graph-changes-while-io.out | 5 + >> 2 files changed, 96 insertions(+) >> create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io >> create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out > Reviewed-by: Eric Blake <eblake@redhat.com> > > Since we found this with the help of NBD, should I be considering this > series for my NBD queue, or is there a better block-related maintainer > queue that it should go through? Well, we found it by using a guest, it’s just that using a guest in the iotests is not quite that great, so we need some other way to induce I/O (concurrently to monitor commands). I could’ve used FUSE, too, but NBD is always compiled in, so. :) In any case, of course I don’t mind who takes this series. If you want to take it, go ahead (and thanks!) – I’ll be sending a v2 to split the `try` block in patch 2, though. Hanna
diff --git a/tests/qemu-iotests/tests/graph-changes-while-io b/tests/qemu-iotests/tests/graph-changes-while-io new file mode 100755 index 0000000000..567e8cf21e --- /dev/null +++ b/tests/qemu-iotests/tests/graph-changes-while-io @@ -0,0 +1,91 @@ +#!/usr/bin/env python3 +# group: rw +# +# Test graph changes while I/O is happening +# +# Copyright (C) 2022 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +import os +from threading import Thread +import iotests +from iotests import imgfmt, qemu_img, qemu_img_create, QMPTestCase, \ + QemuStorageDaemon + + +top = os.path.join(iotests.test_dir, 'top.img') +nbd_sock = os.path.join(iotests.sock_dir, 'nbd.sock') + + +def do_qemu_img_bench() -> None: + """ + Do some I/O requests on `nbd_sock`. + """ + assert qemu_img('bench', '-f', 'raw', '-c', '2000000', + f'nbd+unix:///node0?socket={nbd_sock}') == 0 + + +class TestGraphChangesWhileIO(QMPTestCase): + def setUp(self) -> None: + # Create an overlay that can be added at runtime on top of the + # null-co block node that will receive I/O + assert qemu_img_create('-f', imgfmt, '-F', 'raw', '-b', 'null-co://', + top) == 0 + + # QSD instance with a null-co block node in an I/O thread, + # exported over NBD (on `nbd_sock`, export name "node0") + self.qsd = QemuStorageDaemon( + '--object', 'iothread,id=iothread0', + '--blockdev', 'null-co,node-name=node0,read-zeroes=true', + '--nbd-server', f'addr.type=unix,addr.path={nbd_sock}', + '--export', 'nbd,id=exp0,node-name=node0,iothread=iothread0,' + + 'fixed-iothread=true,writable=true', + qmp=True + ) + + def tearDown(self) -> None: + self.qsd.stop() + + def test_blockdev_add_while_io(self) -> None: + # Run qemu-img bench in the background + bench_thr = Thread(target=do_qemu_img_bench) + bench_thr.start() + + # While qemu-img bench is running, repeatedly add and remove an + # overlay to/from node0 + while bench_thr.is_alive(): + result = self.qsd.qmp('blockdev-add', { + 'driver': imgfmt, + 'node-name': 'overlay', + 'backing': 'node0', + 'file': { + 'driver': 'file', + 'filename': top + } + }) + self.assert_qmp(result, 'return', {}) + + result = self.qsd.qmp('blockdev-del', { + 'node-name': 'overlay' + }) + self.assert_qmp(result, 'return', {}) + + bench_thr.join() + +if __name__ == '__main__': + # Format must support raw backing files + iotests.main(supported_fmts=['qcow', 'qcow2', 'qed'], + supported_protocols=['file']) diff --git a/tests/qemu-iotests/tests/graph-changes-while-io.out b/tests/qemu-iotests/tests/graph-changes-while-io.out new file mode 100644 index 0000000000..ae1213e6f8 --- /dev/null +++ b/tests/qemu-iotests/tests/graph-changes-while-io.out @@ -0,0 +1,5 @@ +. +---------------------------------------------------------------------- +Ran 1 tests + +OK
Test the following scenario: 1. Some block node (null-co) attached to a user (here: NBD server) that performs I/O and keeps the node in an I/O thread 2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay to/from that node Each blockdev-add triggers bdrv_refresh_limits(), and because blockdev-add runs in the main thread, it does not stop the I/O requests. I/O can thus happen while the limits are refreshed, and when such a request sees a temporarily invalid block limit (e.g. alignment is 0), this may easily crash qemu (or the storage daemon in this case). The block layer needs to ensure that I/O requests to a node are paused while that node's BlockLimits are refreshed. Signed-off-by: Hanna Reitz <hreitz@redhat.com> --- .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++ .../tests/graph-changes-while-io.out | 5 + 2 files changed, 96 insertions(+) create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out