[for-6.0?,0/3] job: Add job_wait_unpaused() for block-job-complete

Message ID 20210408162039.242670-1-mreitz@redhat.com (mailing list archive)

Message

Max Reitz April 8, 2021, 4:20 p.m. UTC
Hi,

See patch 1 for a detailed explanation of the problem.

The gist is: Draining a READY job makes it transition to STANDBY, and
jobs on STANDBY cannot be completed.  Ending the drained section will
schedule the job (so it is then resumed), but not wait until it is
actually running again.

Therefore, it can happen that issuing block-job-complete fails when you
issue it right after some draining operation.
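
As a rough illustration of the race, here is a toy model using plain
pthreads. It is only a sketch and not the QEMU job code: every toy_*
name is made up for this example, and the condition variable is just
one assumed way of "waiting until the job is actually running again".
Holding the lock while the resume is scheduled mimics the cheat used
in the unit test, so that the outcome is deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the READY and STANDBY job states. */
typedef enum { TOY_READY, TOY_STANDBY } ToyStatus;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* signalled whenever status changes */
    ToyStatus status;
} ToyJob;

/* Stands in for the scheduled job resuming: STANDBY -> READY. */
static void *toy_job_resume(void *opaque)
{
    ToyJob *job = opaque;

    pthread_mutex_lock(&job->lock);
    job->status = TOY_READY;
    pthread_cond_broadcast(&job->changed);
    pthread_mutex_unlock(&job->lock);
    return NULL;
}

/* Completion is only allowed on a READY job; caller holds job->lock. */
static int toy_job_complete_locked(ToyJob *job)
{
    return job->status == TOY_READY ? 0 : -1;
}

/* Block until the job has left STANDBY; caller holds job->lock. */
static void toy_job_wait_unpaused_locked(ToyJob *job)
{
    while (job->status == TOY_STANDBY) {
        pthread_cond_wait(&job->changed, &job->lock);
    }
}

/*
 * Drain a READY job, end the drained section (which only schedules the
 * resume), then try to complete.  Returns 0 on success, -1 on failure.
 */
static int toy_demo(bool wait_first)
{
    ToyJob job;
    pthread_t resumer;
    int rc, ret;

    pthread_mutex_init(&job.lock, NULL);
    pthread_cond_init(&job.changed, NULL);
    job.status = TOY_STANDBY;             /* drained: READY -> STANDBY */

    /* Hold the lock so the scheduled resume cannot run yet. */
    pthread_mutex_lock(&job.lock);
    rc = pthread_create(&resumer, NULL, toy_job_resume, &job);
    assert(rc == 0);

    if (wait_first) {
        /* Releases the lock while waiting, so the resume can run. */
        toy_job_wait_unpaused_locked(&job);
    }
    ret = toy_job_complete_locked(&job);
    pthread_mutex_unlock(&job.lock);

    pthread_join(resumer, NULL);
    pthread_cond_destroy(&job.changed);
    pthread_mutex_destroy(&job.lock);
    return ret;
}
```

In this model, toy_demo(false) fails because the job is still in
STANDBY when completion is attempted, while toy_demo(true) succeeds
because it first waits for the resume to take effect.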

I tried to come up with an iotest reproducer, but in the end I only got
something that reproduced the issue like 2/10 times, and it required
heavy I/O, so it is nothing I would like to have as part of the iotests.
Instead, I opted for a unit test, which allows me to cheat a bit
(specifically, locking the job IO thread before ending the drained
section).


Max Reitz (3):
  job: Add job_wait_unpaused() for block-job-complete
  test-blockjob: Test job_wait_unpaused()
  iotests/041: block-job-complete on user-paused job

 include/qemu/job.h         |  15 ++++
 blockdev.c                 |   3 +
 job.c                      |  42 +++++++++++
 tests/unit/test-blockjob.c | 140 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/041     |  13 +++-
 5 files changed, 212 insertions(+), 1 deletion(-)

Comments

John Snow April 8, 2021, 5:09 p.m. UTC | #1
On 4/8/21 12:20 PM, Max Reitz wrote:
> Hi,
> 
> See patch 1 for a detailed explanation of the problem.
> 
> The gist is: Draining a READY job makes it transition to STANDBY, and
> jobs on STANDBY cannot be completed.  Ending the drained section will
> schedule the job (so it is then resumed), but not wait until it is
> actually running again.
> 
> Therefore, it can happen that issuing block-job-complete fails when you
> issue it right after some draining operation.
> 
> I tried to come up with an iotest reproducer, but in the end I only got
> something that reproduced the issue like 2/10 times, and it required
> heavy I/O, so it is nothing I would like to have as part of the iotests.
> Instead, I opted for a unit test, which allows me to cheat a bit
> (specifically, locking the job IO thread before ending the drained
> section).
> 
> 
> Max Reitz (3):
>    job: Add job_wait_unpaused() for block-job-complete
>    test-blockjob: Test job_wait_unpaused()
>    iotests/041: block-job-complete on user-paused job
> 
>   include/qemu/job.h         |  15 ++++
>   blockdev.c                 |   3 +
>   job.c                      |  42 +++++++++++
>   tests/unit/test-blockjob.c | 140 +++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/041     |  13 +++-
>   5 files changed, 212 insertions(+), 1 deletion(-)
> 

Left comments and review on #1, skimmed 2/3. Not sure if it's 
appropriate for 6.0 yet, that might depend on the responses to my 
comments and other reviewers and so on.

Acked-by: John Snow <jsnow@redhat.com>