Message ID | dd350838-9132-e787-10b1-4cb200076ec9@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> One more drainage-related thing.
> We have recently encountered an issue with parallel block jobs and it's
> not quite clear how to fix it properly, so would appreciate your ideas.
>
> The small attached tweak makes iotest 30 (-qcow2 -nocache) fail 10/10
> times on my machine.

I just wanted to quickly confirm that I can reproduce this on my system
too; I just checked QEMU v2.8 (when this was introduced) and it was
affected as well. Somehow we didn't detect this back in the day :(

I'll take a look.

Berto
On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> BlockBackend gets deleted by another job's stream_complete(), deferred
> to the main loop, so the fact that the job is put to sleep by
> bdrv_drain_all_begin() doesn't really stop it from execution.

I was debugging this a bit, and the block_job_defer_to_main_loop() call
happens _after_ all jobs have been paused, so I think that when the BDS
is drained then stream_run() finishes the last iteration without
checking if it's paused.

Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
assume that the function would have to continue looping and
block_job_sleep_ns() would make the job coroutine yield, effectively
pausing the job and preventing the crash.

I can fix the crash by adding block_job_pause_point(&s->common) at the
end of stream_run() (where the 'out' label is).

I'm thinking that perhaps we should add the pause point directly to
block_job_defer_to_main_loop(), to prevent any block job from running
the exit function when it's paused.

Somehow I had the impression that we discussed this already in the past
(?) because I remember thinking about this very scenario.

Berto
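For illustration, a minimal sketch of the first fix mentioned above (a
pause point at the 'out' label of stream_run()), written against the
2.11-era block/stream.c; the surrounding code is elided and the change
is untested:

/* At the end of stream_run() in block/stream.c, just before the job
 * defers its completion to the main loop: */
out:
    /* Let the job honour a pending pause request here.  Without this, a
     * drained section that begins during the last loop iteration does
     * not stop stream_complete() from running in the main loop and
     * deleting the BlockBackend. */
    block_job_pause_point(&s->common);

    /* ... existing cleanup and block_job_defer_to_main_loop() ... */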
On 8/11/2017 5:45 PM, Alberto Garcia wrote:
> On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
>> BlockBackend gets deleted by another job's stream_complete(), deferred
>> to the main loop, so the fact that the job is put to sleep by
>> bdrv_drain_all_begin() doesn't really stop it from execution.
>
> I was debugging this a bit, and the block_job_defer_to_main_loop() call
> happens _after_ all jobs have been paused, so I think that when the BDS
> is drained then stream_run() finishes the last iteration without
> checking if it's paused.
>
> Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
> assume that the function would have to continue looping and
> block_job_sleep_ns() would make the job coroutine yield, effectively
> pausing the job and preventing the crash.
>
> I can fix the crash by adding block_job_pause_point(&s->common) at the
> end of stream_run() (where the 'out' label is).
>
> I'm thinking that perhaps we should add the pause point directly to
> block_job_defer_to_main_loop(), to prevent any block job from running
> the exit function when it's paused.
>

Is it possible that the exit function is already deferred when the jobs
are being paused? (even though it's at least less likely to happen)
Then should we flush the bottom halves somehow in addition to putting
the jobs to sleep?

And also then it all probably has to happen before bdrv_reopen_queue()

/Anton

> Somehow I had the impression that we discussed this already in the past
> (?) because I remember thinking about this very scenario.
>
> Berto
>
On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote:
> I'm thinking that perhaps we should add the pause point directly to
> block_job_defer_to_main_loop(), to prevent any block job from running
> the exit function when it's paused.

I was trying this and unfortunately this breaks the mirror job at least
(reproduced with a simple block-commit on the topmost node, which uses
commit_active_start() -> mirror_start_job()).

So what happens is that mirror_run() always calls bdrv_drained_begin()
before returning, pausing the block job. The closing bdrv_drained_end()
call is at the end of mirror_exit(), already in the main loop.

So the mirror job is always calling block_job_defer_to_main_loop() and
mirror_exit() when the job is paused.

Berto
On Thu, 11/09 17:26, Alberto Garcia wrote:
> On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote:
>
> > I'm thinking that perhaps we should add the pause point directly to
> > block_job_defer_to_main_loop(), to prevent any block job from running
> > the exit function when it's paused.
>
> I was trying this and unfortunately this breaks the mirror job at least
> (reproduced with a simple block-commit on the topmost node, which uses
> commit_active_start() -> mirror_start_job()).
>
> So what happens is that mirror_run() always calls bdrv_drained_begin()
> before returning, pausing the block job. The closing bdrv_drained_end()
> call is at the end of mirror_exit(), already in the main loop.
>
> So the mirror job is always calling block_job_defer_to_main_loop() and
> mirror_exit() when the job is paused.
>

FWIW, I think Max's report on 194 failures is related:

https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html

so perhaps it's worth testing his patch too:

https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html

Fam
On Fri 10 Nov 2017 04:02:23 AM CET, Fam Zheng wrote:
>> > I'm thinking that perhaps we should add the pause point directly to
>> > block_job_defer_to_main_loop(), to prevent any block job from
>> > running the exit function when it's paused.
>>
>> I was trying this and unfortunately this breaks the mirror job at
>> least (reproduced with a simple block-commit on the topmost node,
>> which uses commit_active_start() -> mirror_start_job()).
>>
>> So what happens is that mirror_run() always calls
>> bdrv_drained_begin() before returning, pausing the block job. The
>> closing bdrv_drained_end() call is at the end of mirror_exit(),
>> already in the main loop.
>>
>> So the mirror job is always calling block_job_defer_to_main_loop()
>> and mirror_exit() when the job is paused.
>
> FWIW, I think Max's report on 194 failures is related:
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html
>
> so perhaps it's worth testing his patch too:
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html

Well, that doesn't solve the original crash with parallel block jobs.
The root of the crash is that the mirror job manipulates the graph
_while being paused_, so the BlockReopenQueue in bdrv_reopen_multiple()
gets messed up, and pausing the jobs (commit 40840e419be31e) won't help.

I have the impression that one major source of headaches is the fact
that the reopen queue contains nodes that don't need to be reopened at
all. Ideally this should be detected early on in bdrv_reopen_queue(), so
there's no chance that the queue contains nodes used by a different
block job. If we had that then op blockers should be enough to prevent
these things. Or am I missing something?

Berto
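A purely illustrative sketch of that idea, assuming the 2.11-era
bdrv_reopen_queue() internals; the check below is simplistic and
untested, and where exactly it would go is an open question:

/* Hypothetical early-out while building the reopen queue: if neither
 * the flags nor the options of this node would change, leave it out of
 * the queue entirely, so nodes that merely belong to another block job
 * are never touched by bdrv_reopen_multiple(). */
if (flags == bdrv_get_flags(bs) && (!options || !qdict_size(options))) {
    return bs_queue;
}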
On 15/11/2017 6:42 PM, Alberto Garcia wrote:
> On Fri 10 Nov 2017 04:02:23 AM CET, Fam Zheng wrote:
>>>> I'm thinking that perhaps we should add the pause point directly to
>>>> block_job_defer_to_main_loop(), to prevent any block job from
>>>> running the exit function when it's paused.
>>>
>>> I was trying this and unfortunately this breaks the mirror job at
>>> least (reproduced with a simple block-commit on the topmost node,
>>> which uses commit_active_start() -> mirror_start_job()).
>>>
>>> So what happens is that mirror_run() always calls
>>> bdrv_drained_begin() before returning, pausing the block job. The
>>> closing bdrv_drained_end() call is at the end of mirror_exit(),
>>> already in the main loop.
>>>
>>> So the mirror job is always calling block_job_defer_to_main_loop()
>>> and mirror_exit() when the job is paused.
>>
>> FWIW, I think Max's report on 194 failures is related:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html
>>
>> so perhaps it's worth testing his patch too:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html
>
> Well, that doesn't solve the original crash with parallel block jobs.
> The root of the crash is that the mirror job manipulates the graph
> _while being paused_, so the BlockReopenQueue in bdrv_reopen_multiple()
> gets messed up, and pausing the jobs (commit 40840e419be31e) won't help.
>
> I have the impression that one major source of headaches is the fact
> that the reopen queue contains nodes that don't need to be reopened at
> all. Ideally this should be detected early on in bdrv_reopen_queue(), so
> there's no chance that the queue contains nodes used by a different
> block job. If we had that then op blockers should be enough to prevent
> these things. Or am I missing something?
>
> Berto
>

After applying Max's patch I tried the similar approach; that is keep
BDSes referenced while they are in the reopen queue. Now I get the
stream job hanging. Somehow one blk_root_drained_begin() is not paired
with blk_root_drained_end(). So the job stays paused.

Didn't dig deeper yet, but at first glance the reduced reopen queue
won't help with this, as reopen drains all BDSes anyway (or can we
avoid that too?)

/Anton
On Wed 15 Nov 2017 05:31:20 PM CET, Anton Nefedov wrote:
>> I have the impression that one major source of headaches is the fact
>> that the reopen queue contains nodes that don't need to be reopened at
>> all. Ideally this should be detected early on in bdrv_reopen_queue(), so
>> there's no chance that the queue contains nodes used by a different
>> block job. If we had that then op blockers should be enough to prevent
>> these things. Or am I missing something?
>>
> After applying Max's patch I tried the similar approach; that is keep
> BDSes referenced while they are in the reopen queue. Now I get the
> stream job hanging. Somehow one blk_root_drained_begin() is not paired
> with blk_root_drained_end(). So the job stays paused.

I can see this if I apply Max's patch and keep refs to BDSs in the
reopen queue:

#0 block_job_pause (...) at blockjob.c:130
#1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
#2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
#3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
#4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177

There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
doesn't seem to be paired. And when you call block_job_start() then it
yields immediately waiting for the resume (that never arrives).

John, this change was yours (f4d9cc88ee69a5b04). Any idea?

Berto
On 16/11/2017 6:54 PM, Alberto Garcia wrote:
> On Wed 15 Nov 2017 05:31:20 PM CET, Anton Nefedov wrote:
>>> I have the impression that one major source of headaches is the fact
>>> that the reopen queue contains nodes that don't need to be reopened at
>>> all. Ideally this should be detected early on in bdrv_reopen_queue(), so
>>> there's no chance that the queue contains nodes used by a different
>>> block job. If we had that then op blockers should be enough to prevent
>>> these things. Or am I missing something?
>>>
>> After applying Max's patch I tried the similar approach; that is keep
>> BDSes referenced while they are in the reopen queue. Now I get the
>> stream job hanging. Somehow one blk_root_drained_begin() is not paired
>> with blk_root_drained_end(). So the job stays paused.
>
> I can see this if I apply Max's patch and keep refs to BDSs in the
> reopen queue:
>
> #0 block_job_pause (...) at blockjob.c:130
> #1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
> #2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
> #3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
> #4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>
> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
> doesn't seem to be paired.
>

My understanding for now is

1. in bdrv_drain_all_begin(), BDS gets drained with
   bdrv_parent_drained_begin(), all the parents get
   blk_root_drained_begin(), pause their jobs.
2. one of the jobs finishes and deletes BB.
3. in bdrv_drain_all_end(), bdrv_parent_drained_end() is never
   called because even though BDS still exists (referenced in the
   queue), it cannot be accessed as bdrv_next() takes BDS from the
   global BB list (and the corresponding BB is deleted).

Max's patch v1 could have helped:
http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg01835.html

Or, perhaps another approach, keep BlockJob referenced while it is
paused (by block_job_pause/resume_all()). That should prevent it from
deleting the BB.

> And when you call block_job_start() then it
> yields immediately waiting for the resume (that never arrives).
>
> John, this change was yours (f4d9cc88ee69a5b04). Any idea?
>
> Berto
>
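For context, a simplified paraphrase of the 2.11-era
bdrv_drain_all_begin()/bdrv_drain_all_end() loops (the real functions do
quite a bit more); it shows why a BlockBackend deleted between the two
calls breaks the begin/end pairing described above:

void bdrv_drain_all_begin(void)
{
    BlockDriverState *bs;
    BdrvNextIterator it;

    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
        AioContext *ctx = bdrv_get_aio_context(bs);

        aio_context_acquire(ctx);
        bdrv_parent_drained_begin(bs);  /* parents pause their jobs */
        aio_context_release(ctx);
    }
    /* ... wait for in-flight requests ... */
}

void bdrv_drain_all_end(void)
{
    BlockDriverState *bs;
    BdrvNextIterator it;

    /* bdrv_next() walks the BDSs reachable from the global BlockBackend
     * list (plus monitor-owned ones).  If a job deleted its BlockBackend
     * in the meantime, its BDS is no longer visited here, so the
     * drained_begin above is never paired with a drained_end and the
     * job stays paused. */
    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
        AioContext *ctx = bdrv_get_aio_context(bs);

        aio_context_acquire(ctx);
        bdrv_parent_drained_end(bs);
        aio_context_release(ctx);
    }
}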
On 11/16/2017 10:54 AM, Alberto Garcia wrote:
> On Wed 15 Nov 2017 05:31:20 PM CET, Anton Nefedov wrote:
>>> I have the impression that one major source of headaches is the fact
>>> that the reopen queue contains nodes that don't need to be reopened at
>>> all. Ideally this should be detected early on in bdrv_reopen_queue(), so
>>> there's no chance that the queue contains nodes used by a different
>>> block job. If we had that then op blockers should be enough to prevent
>>> these things. Or am I missing something?
>>>
>> After applying Max's patch I tried the similar approach; that is keep
>> BDSes referenced while they are in the reopen queue. Now I get the
>> stream job hanging. Somehow one blk_root_drained_begin() is not paired
>> with blk_root_drained_end(). So the job stays paused.
>
> I can see this if I apply Max's patch and keep refs to BDSs in the
> reopen queue:
>
> #0 block_job_pause (...) at blockjob.c:130
> #1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
> #2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
> #3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
> #4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>
> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
> doesn't seem to be paired. And when you call block_job_start() then it
> yields immediately waiting for the resume (that never arrives).
>
> John, this change was yours (f4d9cc88ee69a5b04). Any idea?
>
> Berto
>

The idea at the time was that if you tell the BlockBackend to drain and
then attach a job to it, once you go to *end* the drained region you'd
have a mismatched begin/end pair.

To allow for some flexibility and to make sure that you *didn't* have a
mismatched begin/end call, what I did was that if you attach dev_ops to
an already drained backend (i.e. we "missed our chance" to issue the
drained_begin) we play catch-up and issue the drained call.

There's no matching call here, because I anticipated whoever initially
bumped that quiesce_counter up to be issuing the drained_end, which will
then be propagated according to the dev_ops structure in place.
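For readers following along, this is roughly the code being discussed,
paraphrased from the 2.11-era blk_set_dev_ops() in
block/block-backend.c (not the verbatim function):

void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
                     void *opaque)
{
    blk->dev_ops = ops;
    blk->dev_opaque = opaque;

    /* The backend may already be inside a drained section; the new
     * dev_ops missed the original drained_begin notification, so replay
     * it now.  The matching drained_end is expected from whoever bumped
     * quiesce_counter in the first place, and it will be forwarded
     * through these same dev_ops. */
    if (blk->quiesce_counter && ops->drained_begin) {
        ops->drained_begin(opaque);
    }
}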
On Thu 16 Nov 2017 10:56:58 PM CET, John Snow wrote:
>>>> I have the impression that one major source of headaches is the fact
>>>> that the reopen queue contains nodes that don't need to be reopened at
>>>> all. Ideally this should be detected early on in bdrv_reopen_queue(), so
>>>> there's no chance that the queue contains nodes used by a different
>>>> block job. If we had that then op blockers should be enough to prevent
>>>> these things. Or am I missing something?
>>>>
>>> After applying Max's patch I tried the similar approach; that is keep
>>> BDSes referenced while they are in the reopen queue. Now I get the
>>> stream job hanging. Somehow one blk_root_drained_begin() is not paired
>>> with blk_root_drained_end(). So the job stays paused.
>>
>> I can see this if I apply Max's patch and keep refs to BDSs in the
>> reopen queue:
>>
>> #0 block_job_pause (...) at blockjob.c:130
>> #1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
>> #2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
>> #3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
>> #4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>>
>> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
>> doesn't seem to be paired. And when you call block_job_start() then it
>> yields immediately waiting for the resume (that never arrives).
>>
>> John, this change was yours (f4d9cc88ee69a5b04). Any idea?
>
> The idea at the time was that if you tell the BlockBackend to drain and
> then attach a job to it, once you go to *end* the drained region you'd
> have a mismatched begin/end pair.
>
> To allow for some flexibility and to make sure that you *didn't* have a
> mismatched begin/end call, what I did was that if you attach dev_ops to
> an already drained backend (i.e. we "missed our chance" to issue the
> drained_begin) we play catch-up and issue the drained call.
>
> There's no matching call here, because I anticipated whoever initially
> bumped that quiesce_counter up to be issuing the drained_end, which will
> then be propagated according to the dev_ops structure in place.

I see, thanks!

Berto
On Thu 16 Nov 2017 05:09:59 PM CET, Anton Nefedov wrote:
>>>> I have the impression that one major source of headaches is the
>>>> fact that the reopen queue contains nodes that don't need to be
>>>> reopened at all. Ideally this should be detected early on in
>>>> bdrv_reopen_queue(), so there's no chance that the queue contains
>>>> nodes used by a different block job. If we had that then op
>>>> blockers should be enough to prevent these things. Or am I missing
>>>> something?
>>>>
>>> After applying Max's patch I tried the similar approach; that is
>>> keep BDSes referenced while they are in the reopen queue. Now I get
>>> the stream job hanging. Somehow one blk_root_drained_begin() is not
>>> paired with blk_root_drained_end(). So the job stays paused.
>>
>> I can see this if I apply Max's patch and keep refs to BDSs in the
>> reopen queue:
>>
>> #0 block_job_pause (...) at blockjob.c:130
>> #1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
>> #2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
>> #3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
>> #4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>>
>> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
>> doesn't seem to be paired.
>
> My understanding for now is
>
> 1. in bdrv_drain_all_begin(), BDS gets drained with
>    bdrv_parent_drained_begin(), all the parents get
>    blk_root_drained_begin(), pause their jobs.
> 2. one of the jobs finishes and deletes BB.
> 3. in bdrv_drain_all_end(), bdrv_parent_drained_end() is never
>    called because even though BDS still exists (referenced in the
>    queue), it cannot be accessed as bdrv_next() takes BDS from the
>    global BB list (and the corresponding BB is deleted).

Yes, I was debugging this and I got to a similar conclusion. With
test_stream_commit from iotest 030 I can see that

1. the block-stream job is created first, then stream_run begins and
   starts copying the data.
2. block-commit starts and tries to reopen the top image in
   read-write mode. This pauses the stream block job and drains all
   BDSs.
3. The draining causes the stream job to finish, it is deferred to
   the main loop, then stream_complete finishes and unrefs the block
   job, deleting it. At the point of deletion the pause count was
   still > 0 (because of step (2)).

> Max's patch v1 could have helped:
> http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg01835.html

This doesn't solve the problem.

> Or, perhaps another approach, keep BlockJob referenced while it is
> paused (by block_job_pause/resume_all()). That should prevent it from
> deleting the BB.

Yes, I tried this and it actually solves the issue. But I still think
that the problem is that block jobs are allowed to finish when they are
paused.

Adding block_job_pause_point(&s->common) at the end of stream_run()
fixes the problem too.

Berto
On 21/11/2017 3:51 PM, Alberto Garcia wrote:
> On Thu 16 Nov 2017 05:09:59 PM CET, Anton Nefedov wrote:
>>>>> I have the impression that one major source of headaches is the
>>>>> fact that the reopen queue contains nodes that don't need to be
>>>>> reopened at all. Ideally this should be detected early on in
>>>>> bdrv_reopen_queue(), so there's no chance that the queue contains
>>>>> nodes used by a different block job. If we had that then op
>>>>> blockers should be enough to prevent these things. Or am I missing
>>>>> something?
>>>>>
>>>> After applying Max's patch I tried the similar approach; that is
>>>> keep BDSes referenced while they are in the reopen queue. Now I get
>>>> the stream job hanging. Somehow one blk_root_drained_begin() is not
>>>> paired with blk_root_drained_end(). So the job stays paused.
>>>
>>> I can see this if I apply Max's patch and keep refs to BDSs in the
>>> reopen queue:
>>>
>>> #0 block_job_pause (...) at blockjob.c:130
>>> #1 0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
>>> #2 0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
>>> #3 0x000055c143cb69db in block_job_create (...) at blockjob.c:678
>>> #4 0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>>>
>>> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
>>> doesn't seem to be paired.
>>
>> My understanding for now is
>>
>> 1. in bdrv_drain_all_begin(), BDS gets drained with
>>    bdrv_parent_drained_begin(), all the parents get
>>    blk_root_drained_begin(), pause their jobs.
>> 2. one of the jobs finishes and deletes BB.
>> 3. in bdrv_drain_all_end(), bdrv_parent_drained_end() is never
>>    called because even though BDS still exists (referenced in the
>>    queue), it cannot be accessed as bdrv_next() takes BDS from the
>>    global BB list (and the corresponding BB is deleted).
>
> Yes, I was debugging this and I got to a similar conclusion. With
> test_stream_commit from iotest 030 I can see that
>
> 1. the block-stream job is created first, then stream_run begins and
>    starts copying the data.
> 2. block-commit starts and tries to reopen the top image in
>    read-write mode. This pauses the stream block job and drains all
>    BDSs.
> 3. The draining causes the stream job to finish, it is deferred to
>    the main loop, then stream_complete finishes and unrefs the block
>    job, deleting it. At the point of deletion the pause count was
>    still > 0 (because of step (2))
>
>> Max's patch v1 could have helped:
>> http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg01835.html
>
> This doesn't solve the problem.
>
>> Or, perhaps another approach, keep BlockJob referenced while it is
>> paused (by block_job_pause/resume_all()). That should prevent it from
>> deleting the BB.
>
> Yes, I tried this and it actually solves the issue. But I still think
> that the problem is that block jobs are allowed to finish when they are
> paused.
>

Agree, but

> Adding block_job_pause_point(&s->common) at the end of stream_run()
> fixes the problem too.
>

would be a nice fix, but it only works unless the job is already
deferred, right? And the BH can't be held off while job->paused because
with mirror job it is always the case. So we cover most of the cases
but also lose the stable reproduction.

This:

>> keep BlockJob referenced while it is
>> paused (by block_job_pause/resume_all()). That should prevent it from
>> deleting the BB.

looks kind of hacky; maybe referencing in block_job_pause() (and not
just pause_all) seems more correct? I think it didn't work for me right
away though. But I can look more.

/Anton

> Berto
>
On Tue 21 Nov 2017 04:18:13 PM CET, Anton Nefedov wrote:
>>> Or, perhaps another approach, keep BlockJob referenced while it is
>>> paused (by block_job_pause/resume_all()). That should prevent it
>>> from deleting the BB.
>>
>> Yes, I tried this and it actually solves the issue. But I still think
>> that the problem is that block jobs are allowed to finish when they
>> are paused.
>
> Agree, but
>
>> Adding block_job_pause_point(&s->common) at the end of stream_run()
>> fixes the problem too.
>
> would be a nice fix, but it only works unless the job is already
> deferred, right?

Right, I didn't mean to propose it as the proper solution (it would
still leave mirror job vulnerable because it's already paused by the
time it calls defer_to_main_loop()).

> This:
>
> >> keep BlockJob referenced while it is
> >> paused (by block_job_pause/resume_all()). That should prevent it from
> >> deleting the BB.
>
> looks kind of hacky; maybe referencing in block_job_pause() (and not
> just pause_all) seems more correct? I think it didn't work for me
> right away though. But I can look more.

You have to be careful when you unref the block job because you may
destroy it, and therefore block_job_next() in block_job_resume_all()
would be using freed memory.

Berto
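To make that concern concrete, here is a rough sketch of the
ref-while-paused idea, assuming the 2.11-era helpers
(block_job_ref/unref, block_job_next, block_job_pause/resume); untested,
and only meant to show where the use-after-free would hide:

void block_job_pause_all(void)
{
    BlockJob *job = NULL;

    while ((job = block_job_next(job))) {
        /* Keep the job (and therefore its BlockBackend) alive until the
         * matching block_job_resume_all(), even if it completes and
         * drops its own references while nominally paused. */
        block_job_ref(job);
        block_job_pause(job);
    }
}

void block_job_resume_all(void)
{
    BlockJob *job, *next;

    for (job = block_job_next(NULL); job; job = next) {
        /* Fetch the next job *before* dropping the reference:
         * block_job_unref() may free 'job', and calling
         * block_job_next(job) on freed memory is exactly the problem
         * described above. */
        next = block_job_next(job);
        block_job_resume(job);
        block_job_unref(job);
    }
}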
CC Jeff Cody ... who may or may not be preoccupied with Thanksgiving
travel now.

Convenient URL for reading past replies:
https://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg03844.html

On 11/21/2017 10:31 AM, Alberto Garcia wrote:
> On Tue 21 Nov 2017 04:18:13 PM CET, Anton Nefedov wrote:
>
>>>> Or, perhaps another approach, keep BlockJob referenced while it is
>>>> paused (by block_job_pause/resume_all()). That should prevent it
>>>> from deleting the BB.
>>>
>>> Yes, I tried this and it actually solves the issue. But I still think
>>> that the problem is that block jobs are allowed to finish when they
>>> are paused.
>>
>> Agree, but
>>
>>> Adding block_job_pause_point(&s->common) at the end of stream_run()
>>> fixes the problem too.
>>
>> would be a nice fix, but it only works unless the job is already
>> deferred, right?
>
> Right, I didn't mean to propose it as the proper solution (it would
> still leave mirror job vulnerable because it's already paused by the
> time it calls defer_to_main_loop()).
>
>> This:
>>
>> >> keep BlockJob referenced while it is
>> >> paused (by block_job_pause/resume_all()). That should prevent it from
>> >> deleting the BB.
>>
>> looks kind of hacky; maybe referencing in block_job_pause() (and not
>> just pause_all) seems more correct? I think it didn't work for me
>> right away though. But I can look more.
>
> You have to be careful when you unref the block job because you may
> destroy it, and therefore block_job_next() in block_job_resume_all()
> would be using freed memory.
>
> Berto
>
On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> The small attached tweak makes iotest 30 (-qcow2 -nocache) fail 10/10
> times on my machine.

I can reproduce the crash with v2.11.0-rc2 without having to modify
QEMU at all using the attached test case (it's based on one of the
existing tests from 030). It doesn't fail 100% of the times, though,
but enough to reproduce it easily.

Berto

----

class TestStreamCommit(iotests.QMPTestCase):
    num_imgs = 6
    imgs = []

    def setUp(self):
        opts = []
        self.imgs = []

        # Initialize file names and command-line options
        for i in range(self.num_imgs):
            img_depth = self.num_imgs - i - 1
            opts.append("backing." * img_depth + "node-name=node%d" % i)
            self.imgs.append(os.path.join(iotests.test_dir, 'img-%d.img' % i))

        # Create all images
        qemu_img('create', '-f', iotests.imgfmt, self.imgs[0], '1M')
        for i in range(1, self.num_imgs):
            qemu_img('create', '-f', iotests.imgfmt,
                     '-o', 'backing_file=%s' % self.imgs[i-1], self.imgs[i])

        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x1 256k 512k', self.imgs[1])
        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x2 768k 256k', self.imgs[1])
        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x3 0 1M', self.imgs[5])

        # Attach the drive to the VM
        self.vm = iotests.VM()
        self.vm.add_drive(self.imgs[-1], ','.join(opts))
        self.vm.launch()

    def tearDown(self):
        self.vm.shutdown()
        for img in self.imgs:
            os.remove(img)

    # Test a block-stream and a block-commit job in parallel
    def test_stream_commit(self):
        self.assert_no_active_block_jobs()

        # Stream from node0 into node2
        result = self.vm.qmp('block-stream', device='node2', job_id='node2')
        self.assert_qmp(result, 'return', {})

        # Commit from the active layer into node3
        result = self.vm.qmp('block-commit', device='drive0', base=self.imgs[3])
        self.assert_qmp(result, 'return', {})

        # Wait for all jobs to be finished.
        pending_jobs = ['node2', 'drive0']
        while len(pending_jobs) > 0:
            for event in self.vm.get_qmp_events(wait=True):
                if event['event'] == 'BLOCK_JOB_COMPLETED':
                    node_name = self.dictpath(event, 'data/device')
                    self.assertTrue(node_name in pending_jobs)
                    self.assert_qmp_absent(event, 'data/error')
                    pending_jobs.remove(node_name)
                if event['event'] == 'BLOCK_JOB_READY':
                    self.assert_qmp(event, 'data/device', 'drive0')
                    self.assert_qmp(event, 'data/type', 'commit')
                    self.assert_qmp_absent(event, 'data/error')
                    self.assertTrue('drive0' in pending_jobs)
                    self.vm.qmp('block-job-complete', device='drive0')

        self.assert_no_active_block_jobs()
diff --git a/block/stream.c b/block/stream.c
index 52d329f..74e980c 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -26,7 +26,7 @@ enum {
     /*
      * Size of data buffer for populating the image file.  This should be large
      * enough to process multiple clusters in a single call, so that populating
      * contiguous regions of the image is efficient.
      */
-    STREAM_BUFFER_SIZE = 512 * 1024, /* in bytes */
+    STREAM_BUFFER_SIZE = 8 * 1024 * 1024, /* in bytes */
 };