Message ID | 20230706131418.423713-1-f.ebner@proxmox.com (mailing list archive)
---|---
State | New, archived
Series | qemu_cleanup: begin drained section after vm_shutdown()
Queued, thanks.

Paolo
On 06.07.2023 16:43, Paolo Bonzini wrote:
> Queued, thanks.
This patch broke qemu-iotests 109 (for raw images): some block jobs now
get paused once more. This is probably okay, but please double-check and
fix either the reference output or the code.
Kevin
On 14.08.2023 15:53, Kevin Wolf wrote:
> On 06.07.2023 16:43, Paolo Bonzini wrote:
> > Queued, thanks.
>
> This patch broke qemu-iotests 109 (for raw images): some block jobs now
> get paused once more. This is probably okay, but please double-check and
> fix either the reference output or the code.

It actually also broke 185. I'm less sure whether not going through
"aborting" is completely harmless. Can you please check why this happens?

Kevin

--- /home/kwolf/source/qemu/tests/qemu-iotests/185.out
+++ /home/kwolf/source/qemu/build-clang/scratch/qcow2-file-185/185.out.bad
@@ -137,7 +137,7 @@
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "mirror"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "mirror"}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "mirror"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "mirror", "len": 33554432, "offset": (filtered), "speed": 0, "type": "mirror"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "mirror"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "mirror"}}
@@ -160,7 +160,7 @@
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "commit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "commit"}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "commit"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "commit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "commit", "len": 33554432, "offset": (filtered), "speed": 0, "type": "commit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "commit"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "commit"}}
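For context, qemu_cleanup() force-cancels any remaining jobs via
job_cancel_sync_all() while everything is drained (see the patch below).
A minimal sketch of that pass, simplified from QEMU's job.c with locking
elided; not the verbatim implementation:

/*
 * Minimal sketch, simplified from QEMU's job.c (locking elided).
 * Synchronous cancellation removes the job from the global list, so
 * the loop can always take the list head again via job_next(NULL).
 */
void job_cancel_sync_all(void)
{
    Job *job;

    while ((job = job_next(NULL))) {
        job_cancel_sync(job, true); /* force-cancel each job */
    }
}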
On 14.08.23 15:53, Kevin Wolf wrote:
> On 06.07.2023 16:43, Paolo Bonzini wrote:
>> Queued, thanks.
>
> This patch broke qemu-iotests 109 (for raw images): some block jobs now
> get paused once more. This is probably okay, but please double-check and
> fix either the reference output or the code.

Sorry about that! AFAICT, the additional pause happens because the
bdrv_drain_all() call in do_vm_stop() is no longer inside the drained
section used by qemu_cleanup(). I.e., there is now a second drained
section that ends before the final one in qemu_cleanup() starts, and
thus job_pause() is called twice via child_job_drained_begin().

I'll try to wrap my head around 185 too.

Best Regards,
Fiona
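The pause bookkeeping referred to here is reference-counted, so each
drained section contributes one pause. A minimal sketch of the two
callbacks involved, simplified from QEMU's blockjob.c (the real code
carries more locking and bookkeeping):

/*
 * Minimal sketch, simplified from QEMU's blockjob.c. Each drained
 * section pauses the job once on begin and resumes it once on end;
 * two back-to-back drained sections therefore pause the job twice.
 */
static void child_job_drained_begin(BdrvChild *c)
{
    BlockJob *job = c->opaque;

    job_pause(&job->job);   /* increments the job's pause count */
}

static void child_job_drained_end(BdrvChild *c)
{
    BlockJob *job = c->opaque;

    job_resume(&job->job);  /* decrements it again */
}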
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index a9fbcf4862..f3bd862818 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -802,21 +802,21 @@ void qemu_cleanup(void)
      */
     blk_exp_close_all();
 
+
+    /* No more vcpu or device emulation activity beyond this point */
+    vm_shutdown();
+    replay_finish();
+
     /*
      * We must cancel all block jobs while the block layer is drained,
      * or cancelling will be affected by throttling and thus may block
      * for an extended period of time.
-     * vm_shutdown() will bdrv_drain_all(), so we may as well include
-     * it in the drained section.
+     * Begin the drained section after vm_shutdown() to avoid requests being
+     * stuck in the BlockBackend's request queue.
      * We do not need to end this section, because we do not want any
      * requests happening from here on anyway.
      */
     bdrv_drain_all_begin();
-
-    /* No more vcpu or device emulation activity beyond this point */
-    vm_shutdown();
-    replay_finish();
-
     job_cancel_sync_all();
 
     bdrv_close_all();
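Condensed from the hunk above, the resulting order of operations in
qemu_cleanup() after this patch reads roughly as follows (surrounding
code elided):

void qemu_cleanup(void)
{
    /* ... */
    blk_exp_close_all();

    /* No more vcpu or device emulation activity beyond this point */
    vm_shutdown();
    replay_finish();

    /*
     * The drained section now only begins after vm_shutdown(), so
     * requests queued in a BlockBackend cannot get stuck; the section
     * is never ended, because no further requests are wanted anyway.
     */
    bdrv_drain_all_begin();
    job_cancel_sync_all();

    bdrv_close_all();
    /* ... */
}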