Message ID | 20210726144613.954844-5-mreitz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | mirror: Handle errors after READY cancel |
On Mon, Jul 26, 2021 at 04:46:10PM +0200, Max Reitz wrote:
> We largely have two cancel modes for jobs:
>
> First, there is actual cancelling. The job is terminated as soon as
> possible, without trying to reach a consistent result.
>
> Second, we have mirror in the READY state. Technically, the job is not
> really cancelled, but it just is a different completion mode. The job
> can still run for an indefinite amount of time while it tries to reach a
> consistent result.
>
> We want to be able to clearly distinguish which cancel mode a job is in
> (when it has been cancelled). We can use Job.force_cancel for this, but
> right now it only reflects cancel requests from the user with
> force=true, but clearly, jobs that do not even distinguish between
> force=false and force=true are effectively always force-cancelled.
>
> So this patch has Job.force_cancel signify whether the job will
> terminate as soon as possible (force_cancel=true) or whether it will
> effectively remain running despite being "cancelled"
> (force_cancel=false).
>
> To this end, we let jobs that provide JobDriver.cancel() tell the
> generic job code whether they will terminate as soon as possible or not,
> and for jobs that do not provide that method we assume they will.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  include/qemu/job.h | 11 ++++++++++-
>  block/backup.c     |  3 ++-
>  block/mirror.c     | 24 ++++++++++++++++++------
>  job.c              |  6 +++++-
>  4 files changed, 35 insertions(+), 9 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>

26.07.2021 17:46, Max Reitz wrote:
> We largely have two cancel modes for jobs:
>
> First, there is actual cancelling. The job is terminated as soon as
> possible, without trying to reach a consistent result.
>
> Second, we have mirror in the READY state. Technically, the job is not
> really cancelled, but it just is a different completion mode. The job
> can still run for an indefinite amount of time while it tries to reach a
> consistent result.
>
> We want to be able to clearly distinguish which cancel mode a job is in
> (when it has been cancelled). We can use Job.force_cancel for this, but
> right now it only reflects cancel requests from the user with
> force=true, but clearly, jobs that do not even distinguish between
> force=false and force=true are effectively always force-cancelled.
>
> So this patch has Job.force_cancel signify whether the job will
> terminate as soon as possible (force_cancel=true) or whether it will
> effectively remain running despite being "cancelled"
> (force_cancel=false).
>
> To this end, we let jobs that provide JobDriver.cancel() tell the
> generic job code whether they will terminate as soon as possible or not,
> and for jobs that do not provide that method we assume they will.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  include/qemu/job.h | 11 ++++++++++-
>  block/backup.c     |  3 ++-
>  block/mirror.c     | 24 ++++++++++++++++++------
>  job.c              |  6 +++++-
>  4 files changed, 35 insertions(+), 9 deletions(-)
>
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 5e8edbc2c8..8aa90f7395 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -253,8 +253,17 @@ struct JobDriver {
>
>      /**
>       * If the callback is not NULL, it will be invoked in job_cancel_async
> +     *
> +     * This function must return true if the job will be cancelled
> +     * immediately without any further I/O (mandatory if @force is
> +     * true), and false otherwise. This lets the generic job layer
> +     * know whether a job has been truly (force-)cancelled, or whether
> +     * it is just in a special completion mode (like mirror after
> +     * READY).
> +     * (If the callback is NULL, the job is assumed to terminate
> +     * without I/O.)
>       */
> -    void (*cancel)(Job *job, bool force);
> +    bool (*cancel)(Job *job, bool force);
>
>
>      /** Called when the job is freed */
> diff --git a/block/backup.c b/block/backup.c
> index bd3614ce70..513e1c8a0b 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -331,11 +331,12 @@ static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
>      }
>  }
>
> -static void backup_cancel(Job *job, bool force)
> +static bool backup_cancel(Job *job, bool force)
>  {
>      BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
>
>      bdrv_cancel_in_flight(s->target_bs);
> +    return true;
>  }
>
>  static const BlockJobDriver backup_job_driver = {
> diff --git a/block/mirror.c b/block/mirror.c
> index fcb7b65f93..e93631a9f6 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1087,9 +1087,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>          trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
>                                    delay_ns);
>          job_sleep_ns(&s->common.job, delay_ns);
> -        if (job_is_cancelled(&s->common.job) &&
> -            (!job_is_ready(&s->common.job) || s->common.job.force_cancel))
> -        {
> +        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {

It seems this could be reduced to

    if (s->common.job.force_cancel) {

>              break;
>          }
>          s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> @@ -1102,7 +1100,7 @@ immediate_exit:
>           * the target is a copy of the source.
>           */
>          assert(ret < 0 ||
> -               ((s->common.job.force_cancel || !job_is_ready(&s->common.job)) &&
> +               (s->common.job.force_cancel &&

and here

>                  job_is_cancelled(&s->common.job)));
>          assert(need_drain);
>          mirror_wait_for_all_io(s);
> @@ -1188,14 +1186,27 @@ static bool mirror_drained_poll(BlockJob *job)
>      return !!s->in_flight;
>  }
>

anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 5e8edbc2c8..8aa90f7395 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -253,8 +253,17 @@ struct JobDriver {
 
     /**
      * If the callback is not NULL, it will be invoked in job_cancel_async
+     *
+     * This function must return true if the job will be cancelled
+     * immediately without any further I/O (mandatory if @force is
+     * true), and false otherwise. This lets the generic job layer
+     * know whether a job has been truly (force-)cancelled, or whether
+     * it is just in a special completion mode (like mirror after
+     * READY).
+     * (If the callback is NULL, the job is assumed to terminate
+     * without I/O.)
      */
-    void (*cancel)(Job *job, bool force);
+    bool (*cancel)(Job *job, bool force);
 
 
     /** Called when the job is freed */
diff --git a/block/backup.c b/block/backup.c
index bd3614ce70..513e1c8a0b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -331,11 +331,12 @@ static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
     }
 }
 
-static void backup_cancel(Job *job, bool force)
+static bool backup_cancel(Job *job, bool force)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
 
     bdrv_cancel_in_flight(s->target_bs);
+    return true;
 }
 
 static const BlockJobDriver backup_job_driver = {
diff --git a/block/mirror.c b/block/mirror.c
index fcb7b65f93..e93631a9f6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1087,9 +1087,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
         trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
                                   delay_ns);
         job_sleep_ns(&s->common.job, delay_ns);
-        if (job_is_cancelled(&s->common.job) &&
-            (!job_is_ready(&s->common.job) || s->common.job.force_cancel))
-        {
+        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {
             break;
         }
         s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1102,7 +1100,7 @@ immediate_exit:
          * the target is a copy of the source.
          */
         assert(ret < 0 ||
-               ((s->common.job.force_cancel || !job_is_ready(&s->common.job)) &&
+               (s->common.job.force_cancel &&
                 job_is_cancelled(&s->common.job)));
         assert(need_drain);
         mirror_wait_for_all_io(s);
@@ -1188,14 +1186,27 @@ static bool mirror_drained_poll(BlockJob *job)
     return !!s->in_flight;
 }
 
-static void mirror_cancel(Job *job, bool force)
+static bool mirror_cancel(Job *job, bool force)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
     BlockDriverState *target = blk_bs(s->target);
 
-    if (force || !job_is_ready(job)) {
+    /*
+     * Before the job is READY, we treat any cancellation like a
+     * force-cancellation.
+     */
+    force = force || !job_is_ready(job);
+
+    if (force) {
         bdrv_cancel_in_flight(target);
     }
+    return force;
+}
+
+static bool commit_active_cancel(Job *job, bool force)
+{
+    /* Same as above in mirror_cancel() */
+    return force || !job_is_ready(job);
 }
 
 static const BlockJobDriver mirror_job_driver = {
@@ -1225,6 +1236,7 @@ static const BlockJobDriver commit_active_job_driver = {
         .abort                  = mirror_abort,
         .pause                  = mirror_pause,
         .complete               = mirror_complete,
+        .cancel                 = commit_active_cancel,
     },
     .drained_poll           = mirror_drained_poll,
 };
diff --git a/job.c b/job.c
index 9e971d64cf..e78d893a9c 100644
--- a/job.c
+++ b/job.c
@@ -719,8 +719,12 @@ static int job_finalize_single(Job *job)
 static void job_cancel_async(Job *job, bool force)
 {
     if (job->driver->cancel) {
-        job->driver->cancel(job, force);
+        force = job->driver->cancel(job, force);
+    } else {
+        /* No .cancel() means the job will behave as if force-cancelled */
+        force = true;
     }
+
     if (job->user_paused) {
         /* Do not call job_enter here, the caller will handle it. */
         if (job->driver->user_resume) {

We largely have two cancel modes for jobs:

First, there is actual cancelling. The job is terminated as soon as
possible, without trying to reach a consistent result.

Second, we have mirror in the READY state. Technically, the job is not
really cancelled, but it just is a different completion mode. The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.

We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled). We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.

So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).

To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/qemu/job.h | 11 ++++++++++-
 block/backup.c     |  3 ++-
 block/mirror.c     | 24 ++++++++++++++++++------
 job.c              |  6 +++++-
 4 files changed, 35 insertions(+), 9 deletions(-)
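
The contract described in the commit message can be sketched as a small standalone model. This is only an illustration, not QEMU code: ToyJob, ToyJobDriver, toy_mirror_cancel and toy_job_cancel_async are hypothetical stand-ins for Job, JobDriver, mirror_cancel() and job_cancel_async(), and the model leaves out all I/O. The way the flag is accumulated with |= is an assumption of the sketch; the hunk shown in this patch does not include that part of job_cancel_async().

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for QEMU's Job and JobDriver; not the real definitions. */
typedef struct ToyJob ToyJob;

typedef struct ToyJobDriver {
    /* Returns true if the job will terminate without any further I/O. */
    bool (*cancel)(ToyJob *job, bool force);
} ToyJobDriver;

struct ToyJob {
    const ToyJobDriver *driver;
    bool ready;         /* models the READY state of a mirror job */
    bool cancelled;
    bool force_cancel;  /* true: terminates as soon as possible */
};

/* Mirror-like driver: any cancel before READY counts as a force-cancel. */
static bool toy_mirror_cancel(ToyJob *job, bool force)
{
    return force || !job->ready;
}

static const ToyJobDriver toy_mirror_driver = { .cancel = toy_mirror_cancel };

/* A driver without .cancel(): assumed to always terminate immediately. */
static const ToyJobDriver toy_plain_driver = { .cancel = NULL };

/* Generic layer: record what the driver reports, not what the user asked. */
static void toy_job_cancel_async(ToyJob *job, bool force)
{
    if (job->driver->cancel) {
        force = job->driver->cancel(job, force);
    } else {
        /* No .cancel() means the job behaves as if force-cancelled. */
        force = true;
    }

    job->cancelled = true;
    /* Assumption: a later soft cancel must not undo an earlier force-cancel. */
    job->force_cancel |= force;
}
```

In this model, a soft cancel of a READY mirror-like job leaves force_cancel false (the job is merely in its special completion mode), while the same request before READY, or on a job whose driver has no .cancel(), sets force_cancel to true.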