mbox series

[v2,00/10] mirror: allow switching from background to active mode

Message ID 20231009094619.469668-1-f.ebner@proxmox.com (mailing list archive)
Headers show
Series mirror: allow switching from background to active mode | expand

Message

Fiona Ebner Oct. 9, 2023, 9:46 a.m. UTC
Changes in v2:
    * move bitmap to filter which allows to avoid draining when
      changing the copy mode
    * add patch to determine copy_to_target only once
    * drop patches returning redundant information upon query
    * update QEMU version in QAPI
    * update indentation in QAPI
    * update indentation in QAPI (like in a937b6aa73 ("qapi: Reformat
      doc comments to conform to current conventions"))
    * add patch to adapt iotest output

Discussion of v1:
https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg07216.html

With active mode, the guest write speed is limited by the synchronous
writes to the mirror target. For this reason, management applications
might want to start out in background mode and only switch to active
mode later, when certain conditions are met. This series adds a
block-job-change QMP command to achieve that, as well as
job-type-specific information when querying block jobs, which
can be used to decide when the switch should happen.

For now, only the direction background -> active is supported.

The information added upon querying is whether the target is actively
synced, the total data sent, and the remaining dirty bytes.

Initially, I tried to go for a more general 'job-change' command, but
I couldn't figure out a way to avoid mutual inclusion between
block-core.json and job.json.


Fiona Ebner (10):
  blockjob: introduce block-job-change QMP command
  block/mirror: set actively_synced even after the job is ready
  block/mirror: move dirty bitmap to filter
  block/mirror: determine copy_to_target only once
  mirror: implement mirror_change method
  qapi/block-core: use JobType for BlockJobInfo's type
  qapi/block-core: turn BlockJobInfo into a union
  blockjob: query driver-specific info via a new 'query' driver method
  mirror: return mirror-specific information upon query
  iotests: adapt test output for new mirror query property

 block/mirror.c                 | 95 +++++++++++++++++++++++-----------
 block/monitor/block-hmp-cmds.c |  4 +-
 blockdev.c                     | 14 +++++
 blockjob.c                     | 26 +++++++++-
 include/block/blockjob.h       | 11 ++++
 include/block/blockjob_int.h   | 10 ++++
 job.c                          |  1 +
 qapi/block-core.json           | 59 +++++++++++++++++++--
 qapi/job.json                  |  4 +-
 tests/qemu-iotests/109.out     | 24 ++++-----
 10 files changed, 199 insertions(+), 49 deletions(-)

Comments

Vladimir Sementsov-Ogievskiy Oct. 10, 2023, 5:55 p.m. UTC | #1
On 09.10.23 12:46, Fiona Ebner wrote:
> Changes in v2:
>      * move bitmap to filter which allows to avoid draining when
>        changing the copy mode
>      * add patch to determine copy_to_target only once
>      * drop patches returning redundant information upon query
>      * update QEMU version in QAPI
>      * update indentation in QAPI
>      * update indentation in QAPI (like in a937b6aa73 ("qapi: Reformat
>        doc comments to conform to current conventions"))
>      * add patch to adapt iotest output
> 
> Discussion of v1:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg07216.html
> 
> With active mode, the guest write speed is limited by the synchronous
> writes to the mirror target. For this reason, management applications
> might want to start out in background mode and only switch to active
> mode later, when certain conditions are met. This series adds a
> block-job-change QMP command to achieve that, as well as
> job-type-specific information when querying block jobs, which
> can be used to decide when the switch should happen.
> 
> For now, only the direction background -> active is supported.
> 
> The information added upon querying is whether the target is actively
> synced, the total data sent, and the remaining dirty bytes.
> 
> Initially, I tried to go for a more general 'job-change' command, but
> I couldn't figure out a way to avoid mutual inclusion between
> block-core.json and job.json.
> 

What is the problem with it? I still think that job-change would be better.

> 
> Fiona Ebner (10):
>    blockjob: introduce block-job-change QMP command
>    block/mirror: set actively_synced even after the job is ready
>    block/mirror: move dirty bitmap to filter
>    block/mirror: determine copy_to_target only once
>    mirror: implement mirror_change method
>    qapi/block-core: use JobType for BlockJobInfo's type
>    qapi/block-core: turn BlockJobInfo into a union
>    blockjob: query driver-specific info via a new 'query' driver method
>    mirror: return mirror-specific information upon query
>    iotests: adapt test output for new mirror query property
> 
>   block/mirror.c                 | 95 +++++++++++++++++++++++-----------
>   block/monitor/block-hmp-cmds.c |  4 +-
>   blockdev.c                     | 14 +++++
>   blockjob.c                     | 26 +++++++++-
>   include/block/blockjob.h       | 11 ++++
>   include/block/blockjob_int.h   | 10 ++++
>   job.c                          |  1 +
>   qapi/block-core.json           | 59 +++++++++++++++++++--
>   qapi/job.json                  |  4 +-
>   tests/qemu-iotests/109.out     | 24 ++++-----
>   10 files changed, 199 insertions(+), 49 deletions(-)
>
Vladimir Sementsov-Ogievskiy Oct. 10, 2023, 8:01 p.m. UTC | #2
On 10.10.23 20:55, Vladimir Sementsov-Ogievskiy wrote:
> On 09.10.23 12:46, Fiona Ebner wrote:
>> Changes in v2:
>>      * move bitmap to filter which allows to avoid draining when
>>        changing the copy mode
>>      * add patch to determine copy_to_target only once
>>      * drop patches returning redundant information upon query
>>      * update QEMU version in QAPI
>>      * update indentation in QAPI
>>      * update indentation in QAPI (like in a937b6aa73 ("qapi: Reformat
>>        doc comments to conform to current conventions"))
>>      * add patch to adapt iotest output
>>
>> Discussion of v1:
>> https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg07216.html
>>
>> With active mode, the guest write speed is limited by the synchronous
>> writes to the mirror target. For this reason, management applications
>> might want to start out in background mode and only switch to active
>> mode later, when certain conditions are met. This series adds a
>> block-job-change QMP command to achieve that, as well as
>> job-type-specific information when querying block jobs, which
>> can be used to decide when the switch should happen.
>>
>> For now, only the direction background -> active is supported.
>>
>> The information added upon querying is whether the target is actively
>> synced, the total data sent, and the remaining dirty bytes.
>>
>> Initially, I tried to go for a more general 'job-change' command, but
>> I couldn't figure out a way to avoid mutual inclusion between
>> block-core.json and job.json.
>>
> 
> What is the problem with it? I still think that job-change would be better.

However, that's not a show-stopper. We have block-job-* commands, they are not deprecated, no problem to add one more.
Fiona Ebner Oct. 11, 2023, 10:18 a.m. UTC | #3
Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
> On 09.10.23 12:46, Fiona Ebner wrote:
>>
>> Initially, I tried to go for a more general 'job-change' command, but
>> I couldn't figure out a way to avoid mutual inclusion between
>> block-core.json and job.json.
>>
> 
> What is the problem with it? I still think that job-change would be better.
> 

If going for job-change in job.json, the dependencies would be

job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode

query-jobs -> JobInfo -> JobInfoMirror

and we can't include block-core.json in job.json, because an inclusion
loop gives a build error.

Could be made to work by moving MirrorCopyMode (and
JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
can be included by both job.json and block-core.json. Moving the
type-specific definitions to the general job.json didn't feel right to
me. Including another file with type-specific definitions in job.json
feels slightly less wrong, but still not quite right and I didn't want
to create a new file just for MirrorCopyMode (and
JobChangeOptionsMirror, JobInfoMirror).

And going further and moving all mirror-related things to a separate
file would require moving along things like NewImageMode with it or
create yet another file for such general things used by multiple block-jobs.

If preferred, I can try and go with some version of the above.

Best Regards,
Fiona
Vladimir Sementsov-Ogievskiy Oct. 12, 2023, 2:10 p.m. UTC | #4
On 11.10.23 13:18, Fiona Ebner wrote:
> Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
>> On 09.10.23 12:46, Fiona Ebner wrote:
>>>
>>> Initially, I tried to go for a more general 'job-change' command, but
>>> I couldn't figure out a way to avoid mutual inclusion between
>>> block-core.json and job.json.
>>>
>>
>> What is the problem with it? I still think that job-change would be better.
>>
> 
> If going for job-change in job.json, the dependencies would be
> 
> job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
> 
> query-jobs -> JobInfo -> JobInfoMirror
> 
> and we can't include block-core.json in job.json, because an inclusion
> loop gives a build error.
> 
> Could be made to work by moving MirrorCopyMode (and
> JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
> can be included by both job.json and block-core.json. Moving the
> type-specific definitions to the general job.json didn't feel right to
> me. Including another file with type-specific definitions in job.json
> feels slightly less wrong, but still not quite right and I didn't want
> to create a new file just for MirrorCopyMode (and
> JobChangeOptionsMirror, JobInfoMirror).
> 
> And going further and moving all mirror-related things to a separate
> file would require moving along things like NewImageMode with it or
> create yet another file for such general things used by multiple block-jobs.
> 
> If preferred, I can try and go with some version of the above.
> 

OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.
Markus Armbruster Nov. 3, 2023, 9:36 a.m. UTC | #5
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:

> On 11.10.23 13:18, Fiona Ebner wrote:
>> Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
>>> On 09.10.23 12:46, Fiona Ebner wrote:
>>>>
>>>> Initially, I tried to go for a more general 'job-change' command, but
>>>> I couldn't figure out a way to avoid mutual inclusion between
>>>> block-core.json and job.json.
>>>>
>>>
>>> What is the problem with it? I still think that job-change would be better.
>>>
>> If going for job-change in job.json, the dependencies would be
>> job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
>> query-jobs -> JobInfo -> JobInfoMirror
>> and we can't include block-core.json in job.json, because an inclusion
>> loop gives a build error.

Let me try to understand this.

Command job-change needs its argument type JobChangeOptions.

JobChangeOptions is a union, and JobChangeOptionsMirror is one of its
branches.

JobChangeOptionsMirror needs MirrorCopyMode from block-core.json.

block-core.json needs job.json for JobType and JobStatus.

>> Could be made to work by moving MirrorCopyMode (and
>> JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
>> can be included by both job.json and block-core.json. Moving the
>> type-specific definitions to the general job.json didn't feel right to
>> me. Including another file with type-specific definitions in job.json
>> feels slightly less wrong, but still not quite right and I didn't want
>> to create a new file just for MirrorCopyMode (and
>> JobChangeOptionsMirror, JobInfoMirror).
>> And going further and moving all mirror-related things to a separate
>> file would require moving along things like NewImageMode with it or
>> create yet another file for such general things used by multiple block-jobs.
>> If preferred, I can try and go with some version of the above.
>> 
>
> OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.

Saving ourselves some internal refactoring is a poor excuse for
undesirable external interfaces.  We need to answer two questions before
we do that:

1. How much work would the refactoring be?

2. Is the interface improvement this enables worth the work?

Let's start with 1.

An obvious solution is to split JobType and JobStatus off job.json to
break the dependency of block-core.json on job.json.

But I'd like us to investigate another one.  block-core.json is *huge*.
It's almost a quarter of the entire QAPI schema.  Can we spin out block
jobs into block-job.json?  Moves the dependency on job.json from
block-core.json to block-job.json.

Thoughts?
Kevin Wolf Nov. 3, 2023, 11:54 a.m. UTC | #6
Am 03.11.2023 um 10:36 hat Markus Armbruster geschrieben:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> 
> > On 11.10.23 13:18, Fiona Ebner wrote:
> >> Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
> >>> On 09.10.23 12:46, Fiona Ebner wrote:
> >>>>
> >>>> Initially, I tried to go for a more general 'job-change' command, but
> >>>> I couldn't figure out a way to avoid mutual inclusion between
> >>>> block-core.json and job.json.
> >>>>
> >>>
> >>> What is the problem with it? I still think that job-change would be better.
> >>>
> >> If going for job-change in job.json, the dependencies would be
> >> job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
> >> query-jobs -> JobInfo -> JobInfoMirror
> >> and we can't include block-core.json in job.json, because an inclusion
> >> loop gives a build error.
> 
> Let me try to understand this.
> 
> Command job-change needs its argument type JobChangeOptions.
> 
> JobChangeOptions is a union, and JobChangeOptionsMirror is one of its
> branches.
> 
> JobChangeOptionsMirror needs MirrorCopyMode from block-core.json.
> 
> block-core.json needs job.json for JobType and JobStatus.
> 
> >> Could be made to work by moving MirrorCopyMode (and
> >> JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
> >> can be included by both job.json and block-core.json. Moving the
> >> type-specific definitions to the general job.json didn't feel right to
> >> me. Including another file with type-specific definitions in job.json
> >> feels slightly less wrong, but still not quite right and I didn't want
> >> to create a new file just for MirrorCopyMode (and
> >> JobChangeOptionsMirror, JobInfoMirror).
> >> And going further and moving all mirror-related things to a separate
> >> file would require moving along things like NewImageMode with it or
> >> create yet another file for such general things used by multiple block-jobs.
> >> If preferred, I can try and go with some version of the above.
> >> 
> >
> > OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.
> 
> Saving ourselves some internal refactoring is a poor excuse for
> undesirable external interfaces.

I'm not sure how undesirable it is. We have block-job-* commands for
pretty much every other operation, so it's only consistent to have
block-job-change, too.

Having job-change, too, might be nice in theory, but we don't have even
a potential user for it at this point (i.e. a job type that isn't a
block job, but for which changing options at runtime makes sense).

> We need to answer two questions before we do that:
> 
> 1. How much work would the refactoring be?
> 
> 2. Is the interface improvement this enables worth the work?
> 
> Let's start with 1.
> 
> An obvious solution is to split JobType and JobStatus off job.json to
> break the dependency of block-core.json on job.json.
> 
> But I'd like us to investigate another one.  block-core.json is *huge*.
> It's almost a quarter of the entire QAPI schema.  Can we spin out block
> jobs into block-job.json?  Moves the dependency on job.json from
> block-core.json to block-job.json.

It also makes job.json depend on block-job.json instead of
block-core.json, so you only moved the problem without solving it.

Kevin
Markus Armbruster Nov. 3, 2023, 3:56 p.m. UTC | #7
Kevin Wolf <kwolf@redhat.com> writes:

> Am 03.11.2023 um 10:36 hat Markus Armbruster geschrieben:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>> 
>> > On 11.10.23 13:18, Fiona Ebner wrote:
>> >> Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
>> >>> On 09.10.23 12:46, Fiona Ebner wrote:
>> >>>>
>> >>>> Initially, I tried to go for a more general 'job-change' command, but
>> >>>> I couldn't figure out a way to avoid mutual inclusion between
>> >>>> block-core.json and job.json.
>> >>>>
>> >>>
>> >>> What is the problem with it? I still think that job-change would be better.
>> >>>
>> >> If going for job-change in job.json, the dependencies would be
>> >> job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
>> >> query-jobs -> JobInfo -> JobInfoMirror
>> >> and we can't include block-core.json in job.json, because an inclusion
>> >> loop gives a build error.
>> 
>> Let me try to understand this.
>> 
>> Command job-change needs its argument type JobChangeOptions.
>> 
>> JobChangeOptions is a union, and JobChangeOptionsMirror is one of its
>> branches.
>> 
>> JobChangeOptionsMirror needs MirrorCopyMode from block-core.json.
>> 
>> block-core.json needs job.json for JobType and JobStatus.
>> 
>> >> Could be made to work by moving MirrorCopyMode (and
>> >> JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
>> >> can be included by both job.json and block-core.json. Moving the
>> >> type-specific definitions to the general job.json didn't feel right to
>> >> me. Including another file with type-specific definitions in job.json
>> >> feels slightly less wrong, but still not quite right and I didn't want
>> >> to create a new file just for MirrorCopyMode (and
>> >> JobChangeOptionsMirror, JobInfoMirror).
>> >> And going further and moving all mirror-related things to a separate
>> >> file would require moving along things like NewImageMode with it or
>> >> create yet another file for such general things used by multiple block-jobs.
>> >> If preferred, I can try and go with some version of the above.
>> >> 
>> >
>> > OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.
>> 
>> Saving ourselves some internal refactoring is a poor excuse for
>> undesirable external interfaces.
>
> I'm not sure how undesirable it is. We have block-job-* commands for
> pretty much every other operation, so it's only consistent to have
> block-job-change, too.

Is the job abstraction a failure?

We have

    block-job- command      since   job- command    since
    -----------------------------------------------------
    block-job-set-speed     1.1
    block-job-cancel        1.1     job-cancel      3.0
    block-job-pause         1.3     job-pause       3.0
    block-job-resume        1.3     job-resume      3.0
    block-job-complete      1.3     job-complete    3.0
    block-job-dismiss       2.12    job-dismiss     3.0
    block-job-finalize      2.12    job-finalize    3.0
    block-job-change        8.2
    query-block-jobs        1.1     query-jobs

I was under the impression that we added the (more general) job-
commands to replace the (less general) block-job commands, and we're
keeping the latter just for compatibility.  Am I mistaken?

Which one should be used?

Why not deprecate the one that shouldn't be used?

The addition of block-job-change without even trying to do job-change
makes me wonder: have we given up on the job- interface?

I'm okay with giving up on failures.  All I want is clarity.  Right now,
I feel thoroughly confused about the status block-jobs and jobs, and how
they're related.

> Having job-change, too, might be nice in theory, but we don't have even
> a potential user for it at this point (i.e. a job type that isn't a
> block job, but for which changing options at runtime makes sense).
>
>> We need to answer two questions before we do that:
>> 
>> 1. How much work would the refactoring be?
>> 
>> 2. Is the interface improvement this enables worth the work?
>> 
>> Let's start with 1.
>> 
>> An obvious solution is to split JobType and JobStatus off job.json to
>> break the dependency of block-core.json on job.json.
>> 
>> But I'd like us to investigate another one.  block-core.json is *huge*.
>> It's almost a quarter of the entire QAPI schema.  Can we spin out block
>> jobs into block-job.json?  Moves the dependency on job.json from
>> block-core.json to block-job.json.
>
> It also makes job.json depend on block-job.json instead of
> block-core.json, so you only moved the problem without solving it.

block-job.json needs block-core.json and job.json.

job.json needs block-core.json.

No circle so far.
Vladimir Sementsov-Ogievskiy Feb. 28, 2024, 6:07 p.m. UTC | #8
On 03.11.23 18:56, Markus Armbruster wrote:
> Kevin Wolf<kwolf@redhat.com>  writes:
> 
>> Am 03.11.2023 um 10:36 hat Markus Armbruster geschrieben:
>>> Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>  writes:
>>>
>>>> On 11.10.23 13:18, Fiona Ebner wrote:
>>>>> Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
>>>>>> On 09.10.23 12:46, Fiona Ebner wrote:
>>>>>>> Initially, I tried to go for a more general 'job-change' command, but
>>>>>>> I couldn't figure out a way to avoid mutual inclusion between
>>>>>>> block-core.json and job.json.
>>>>>>>
>>>>>> What is the problem with it? I still think that job-change would be better.
>>>>>>
>>>>> If going for job-change in job.json, the dependencies would be
>>>>> job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
>>>>> query-jobs -> JobInfo -> JobInfoMirror
>>>>> and we can't include block-core.json in job.json, because an inclusion
>>>>> loop gives a build error.
>>> Let me try to understand this.
>>>
>>> Command job-change needs its argument type JobChangeOptions.
>>>
>>> JobChangeOptions is a union, and JobChangeOptionsMirror is one of its
>>> branches.
>>>
>>> JobChangeOptionsMirror needs MirrorCopyMode from block-core.json.
>>>
>>> block-core.json needs job.json for JobType and JobStatus.
>>>
>>>>> Could be made to work by moving MirrorCopyMode (and
>>>>> JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
>>>>> can be included by both job.json and block-core.json. Moving the
>>>>> type-specific definitions to the general job.json didn't feel right to
>>>>> me. Including another file with type-specific definitions in job.json
>>>>> feels slightly less wrong, but still not quite right and I didn't want
>>>>> to create a new file just for MirrorCopyMode (and
>>>>> JobChangeOptionsMirror, JobInfoMirror).
>>>>> And going further and moving all mirror-related things to a separate
>>>>> file would require moving along things like NewImageMode with it or
>>>>> create yet another file for such general things used by multiple block-jobs.
>>>>> If preferred, I can try and go with some version of the above.
>>>>>
>>>> OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.
>>> Saving ourselves some internal refactoring is a poor excuse for
>>> undesirable external interfaces.
>> I'm not sure how undesirable it is. We have block-job-* commands for
>> pretty much every other operation, so it's only consistent to have
>> block-job-change, too.
> Is the job abstraction a failure?
> 
> We have
> 
>      block-job- command      since   job- command    since
>      -----------------------------------------------------
>      block-job-set-speed     1.1
>      block-job-cancel        1.1     job-cancel      3.0
>      block-job-pause         1.3     job-pause       3.0
>      block-job-resume        1.3     job-resume      3.0
>      block-job-complete      1.3     job-complete    3.0
>      block-job-dismiss       2.12    job-dismiss     3.0
>      block-job-finalize      2.12    job-finalize    3.0
>      block-job-change        8.2
>      query-block-jobs        1.1     query-jobs
> 
> I was under the impression that we added the (more general) job-
> commands to replace the (less general) block-job commands, and we're
> keeping the latter just for compatibility.  Am I mistaken?
> 
> Which one should be used?
> 
> Why not deprecate the one that shouldn't be used?
> 
> The addition of block-job-change without even trying to do job-change
> makes me wonder: have we given up on the job- interface?
> 
> I'm okay with giving up on failures.  All I want is clarity.  Right now,
> I feel thoroughly confused about the status block-jobs and jobs, and how
> they're related.

Hi! I didn't notice, that the series was finally merged.

About the APIs, I think, of course we should deprecate block-job-* API, because we already have jobs which are not block-jobs, so we can't deprecate job-* API.

So I suggest a plan:

1. Add job-change command simply in block-core.json, as a simple copy of block-job-change, to not care with resolving inclusion loops. (ha we could simply name our block-job-change to be job-change and place it in block-core.json, but now is too late)

2. Support changing speed in a new job-chage command. (or both in block-job-change and job-change, keeping them equal)

3. Deprecate block-job-* APIs

4. Wait 3 releases

5. Drop block-job-* APIs

6. Move all job-related stuff to job.json, drop `{ 'include': 'job.json' }` from block-core.json, and instead include block-core.json into job.json

If it's OK, I can go through the steps.
Markus Armbruster Feb. 29, 2024, 5:28 a.m. UTC | #9
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:

> On 03.11.23 18:56, Markus Armbruster wrote:
>> Is the job abstraction a failure?
>>
>> We have
>>
>>      block-job- command      since   job- command    since
>>      -----------------------------------------------------
>>      block-job-set-speed     1.1
>>      block-job-cancel        1.1     job-cancel      3.0
>>      block-job-pause         1.3     job-pause       3.0
>>      block-job-resume        1.3     job-resume      3.0
>>      block-job-complete      1.3     job-complete    3.0
>>      block-job-dismiss       2.12    job-dismiss     3.0
>>      block-job-finalize      2.12    job-finalize    3.0
>>      block-job-change        8.2
>>      query-block-jobs        1.1     query-jobs
>>
>> I was under the impression that we added the (more general) job-
>> commands to replace the (less general) block-job commands, and we're
>> keeping the latter just for compatibility.  Am I mistaken?
>>
>> Which one should be used?
>>
>> Why not deprecate the one that shouldn't be used?
>>
>> The addition of block-job-change without even trying to do job-change
>> makes me wonder: have we given up on the job- interface?
>>
>> I'm okay with giving up on failures.  All I want is clarity.  Right now,
>> I feel thoroughly confused about the status block-jobs and jobs, and how
>> they're related.
>
> Hi! I didn't notice, that the series was finally merged.
>
> About the APIs, I think, of course we should deprecate block-job-* API, because we already have jobs which are not block-jobs, so we can't deprecate job-* API.
>
> So I suggest a plan:
>
> 1. Add job-change command simply in block-core.json, as a simple copy of block-job-change, to not care with resolving inclusion loops. (ha we could simply name our block-job-change to be job-change and place it in block-core.json, but now is too late)
>
> 2. Support changing speed in a new job-chage command. (or both in block-job-change and job-change, keeping them equal)
>
> 3. Deprecate block-job-* APIs
>
> 4. Wait 3 releases
>
> 5. Drop block-job-* APIs
>
> 6. Move all job-related stuff to job.json, drop `{ 'include': 'job.json' }` from block-core.json, and instead include block-core.json into job.json

Sounds good to me.

> If it's OK, I can go through the steps.

Lovely!
Kevin Wolf March 4, 2024, 10:48 a.m. UTC | #10
Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 03.11.23 18:56, Markus Armbruster wrote:
> > Kevin Wolf<kwolf@redhat.com>  writes:
> > 
> > > Am 03.11.2023 um 10:36 hat Markus Armbruster geschrieben:
> > > > Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>  writes:
> > > > 
> > > > > On 11.10.23 13:18, Fiona Ebner wrote:
> > > > > > Am 10.10.23 um 19:55 schrieb Vladimir Sementsov-Ogievskiy:
> > > > > > > On 09.10.23 12:46, Fiona Ebner wrote:
> > > > > > > > Initially, I tried to go for a more general 'job-change' command, but
> > > > > > > > I couldn't figure out a way to avoid mutual inclusion between
> > > > > > > > block-core.json and job.json.
> > > > > > > > 
> > > > > > > What is the problem with it? I still think that job-change would be better.
> > > > > > > 
> > > > > > If going for job-change in job.json, the dependencies would be
> > > > > > job-change -> JobChangeOptions -> JobChangeOptionsMirror -> MirrorCopyMode
> > > > > > query-jobs -> JobInfo -> JobInfoMirror
> > > > > > and we can't include block-core.json in job.json, because an inclusion
> > > > > > loop gives a build error.
> > > > Let me try to understand this.
> > > > 
> > > > Command job-change needs its argument type JobChangeOptions.
> > > > 
> > > > JobChangeOptions is a union, and JobChangeOptionsMirror is one of its
> > > > branches.
> > > > 
> > > > JobChangeOptionsMirror needs MirrorCopyMode from block-core.json.
> > > > 
> > > > block-core.json needs job.json for JobType and JobStatus.
> > > > 
> > > > > > Could be made to work by moving MirrorCopyMode (and
> > > > > > JobChangeOptionsMirror, JobInfoMirror) to job.json or some place that
> > > > > > can be included by both job.json and block-core.json. Moving the
> > > > > > type-specific definitions to the general job.json didn't feel right to
> > > > > > me. Including another file with type-specific definitions in job.json
> > > > > > feels slightly less wrong, but still not quite right and I didn't want
> > > > > > to create a new file just for MirrorCopyMode (and
> > > > > > JobChangeOptionsMirror, JobInfoMirror).
> > > > > > And going further and moving all mirror-related things to a separate
> > > > > > file would require moving along things like NewImageMode with it or
> > > > > > create yet another file for such general things used by multiple block-jobs.
> > > > > > If preferred, I can try and go with some version of the above.
> > > > > > 
> > > > > OK, I see the problem. Seems, that all requires some good refactoring. But that's a preexisting big work, and should not hold up your series. I'm OK to proceed with block-job-change.
> > > > Saving ourselves some internal refactoring is a poor excuse for
> > > > undesirable external interfaces.
> > > I'm not sure how undesirable it is. We have block-job-* commands for
> > > pretty much every other operation, so it's only consistent to have
> > > block-job-change, too.
> > Is the job abstraction a failure?
> > 
> > We have
> > 
> >      block-job- command      since   job- command    since
> >      -----------------------------------------------------
> >      block-job-set-speed     1.1
> >      block-job-cancel        1.1     job-cancel      3.0
> >      block-job-pause         1.3     job-pause       3.0
> >      block-job-resume        1.3     job-resume      3.0
> >      block-job-complete      1.3     job-complete    3.0
> >      block-job-dismiss       2.12    job-dismiss     3.0
> >      block-job-finalize      2.12    job-finalize    3.0
> >      block-job-change        8.2
> >      query-block-jobs        1.1     query-jobs
> > 
> > I was under the impression that we added the (more general) job-
> > commands to replace the (less general) block-job commands, and we're
> > keeping the latter just for compatibility.  Am I mistaken?
> > 
> > Which one should be used?
> > 
> > Why not deprecate the one that shouldn't be used?
> > 
> > The addition of block-job-change without even trying to do job-change
> > makes me wonder: have we given up on the job- interface?
> > 
> > I'm okay with giving up on failures.  All I want is clarity.  Right now,
> > I feel thoroughly confused about the status block-jobs and jobs, and how
> > they're related.
> 
> Hi! I didn't notice, that the series was finally merged.
> 
> About the APIs, I think, of course we should deprecate block-job-* API, because we already have jobs which are not block-jobs, so we can't deprecate job-* API.
> 
> So I suggest a plan:
> 
> 1. Add job-change command simply in block-core.json, as a simple copy
>    of block-job-change, to not care with resolving inclusion loops.
>    (ha we could simply name our block-job-change to be job-change and
>    place it in block-core.json, but now is too late)
> 
> 2. Support changing speed in a new job-chage command. (or both in
>    block-job-change and job-change, keeping them equal)

It should be both block-job-change and job-change.

Having job-change in block-core.json rather than job.json is ugly, but
if Markus doesn't complain, why would I.

> 3. Deprecate block-job-* APIs
> 
> 4. Wait 3 releases
> 
> 5. Drop block-job-* APIs

I consider these strictly optional. We don't really have strong reasons
to deprecate these commands (they are just thin wrappers), and I think
libvirt still uses block-job-* in some places.

We also need to check if the interfaces are really the same. For
example, JobInfo is only a small subset of BlockJobInfo. Some things
could be added to JobInfo, other things like BlockDeviceIoStatus don't
really have a place there, so we would have to introduce job type
specific data in query-jobs first.

I'm sure it's all doable, but it might be more work than your list above
would make you think.

> 6. Move all job-related stuff to job.json, drop `{ 'include':
>    'job.json' }` from block-core.json, and instead include
>    block-core.json into job.json

Of course, this cleanup assumes that steps 3.-5. are really implemented.
If not, you would end up moving a lot more block related things to
job.json than after them.

Kevin
Peter Krempa March 4, 2024, 11:09 a.m. UTC | #11
On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > On 03.11.23 18:56, Markus Armbruster wrote:
> > > Kevin Wolf<kwolf@redhat.com>  writes:

[...]

> > > Is the job abstraction a failure?
> > > 
> > > We have
> > > 
> > >      block-job- command      since   job- command    since
> > >      -----------------------------------------------------
> > >      block-job-set-speed     1.1
> > >      block-job-cancel        1.1     job-cancel      3.0
> > >      block-job-pause         1.3     job-pause       3.0
> > >      block-job-resume        1.3     job-resume      3.0
> > >      block-job-complete      1.3     job-complete    3.0
> > >      block-job-dismiss       2.12    job-dismiss     3.0
> > >      block-job-finalize      2.12    job-finalize    3.0
> > >      block-job-change        8.2
> > >      query-block-jobs        1.1     query-jobs

[...]

> I consider these strictly optional. We don't really have strong reasons
> to deprecate these commands (they are just thin wrappers), and I think
> libvirt still uses block-job-* in some places.

Libvirt uses 'block-job-cancel' because it has different semantics from
'job-cancel' which libvirt documented as the behaviour of the API that
uses it. (Semantics regarding the expectation of what is written to the
destination node at the point when the job is cancelled).

Libvirt also uses 'block-job-set-speed' and 'query-block-jobs' but those
can be replaced easily and looking at the above table even without any
feature checks.

Thus the plan to deprecate at least 'block-job-cancel' will not work
unless the semantics are ported into 'job-cancel'.
Markus Armbruster March 4, 2024, 12:27 p.m. UTC | #12
Kevin Wolf <kwolf@redhat.com> writes:

> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:

[...]

>> About the APIs, I think, of course we should deprecate block-job-* API, because we already have jobs which are not block-jobs, so we can't deprecate job-* API.
>> 
>> So I suggest a plan:
>> 
>> 1. Add job-change command simply in block-core.json, as a simple copy
>>    of block-job-change, to not care with resolving inclusion loops.
>>    (ha we could simply name our block-job-change to be job-change and
>>    place it in block-core.json, but now is too late)
>> 
>> 2. Support changing speed in a new job-chage command. (or both in
>>    block-job-change and job-change, keeping them equal)
>
> It should be both block-job-change and job-change.
>
> Having job-change in block-core.json rather than job.json is ugly, but
> if Markus doesn't complain, why would I.

What we have is ugly and confusing: two interfaces with insufficient
guidance on which one to use.

Unifying the interfaces will reduce confusion immediately, and may
reduce ugliness eventually.

I take it.

[...]
Vladimir Sementsov-Ogievskiy March 7, 2024, 7:42 p.m. UTC | #13
On 04.03.24 14:09, Peter Krempa wrote:
> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>> Kevin Wolf<kwolf@redhat.com>  writes:
> 
> [...]
> 
>>>> Is the job abstraction a failure?
>>>>
>>>> We have
>>>>
>>>>       block-job- command      since   job- command    since
>>>>       -----------------------------------------------------
>>>>       block-job-set-speed     1.1
>>>>       block-job-cancel        1.1     job-cancel      3.0
>>>>       block-job-pause         1.3     job-pause       3.0
>>>>       block-job-resume        1.3     job-resume      3.0
>>>>       block-job-complete      1.3     job-complete    3.0
>>>>       block-job-dismiss       2.12    job-dismiss     3.0
>>>>       block-job-finalize      2.12    job-finalize    3.0
>>>>       block-job-change        8.2
>>>>       query-block-jobs        1.1     query-jobs
> 
> [...]
> 
>> I consider these strictly optional. We don't really have strong reasons
>> to deprecate these commands (they are just thin wrappers), and I think
>> libvirt still uses block-job-* in some places.
> 
> Libvirt uses 'block-job-cancel' because it has different semantics from
> 'job-cancel' which libvirt documented as the behaviour of the API that
> uses it. (Semantics regarding the expectation of what is written to the
> destination node at the point when the job is cancelled).
> 

That's the following semantics:

   # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
   # indicated (via the event BLOCK_JOB_READY) that the source and
   # destination are synchronized, then the event triggered by this
   # command changes to BLOCK_JOB_COMPLETED, to indicate that the
   # mirroring has ended and the destination now has a point-in-time copy
   # tied to the time of the cancellation.

Hmm. Looking at this, it looks for me, that should probably a 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).

Actually, what is the difference between block-job-complete and block-job-cancel(force=false) for mirror in ready state?

I only see the following differencies:

1. block-job-complete documents that it completes the job synchronously.. But looking at mirror code I see it just set s->should_complete = true, which will be then handled asynchronously.. So I doubt that documentation is correct.

2. block-job-complete will trigger final graph changes. block-job-cancel will not.

Is [2] really useful? Seems yes: in case of some failure before starting migration target, we'd like to continue executing source. So, no reason to break block-graph in source, better keep it unchanged.

But I think, such behavior better be setup by mirror-job start parameter, rather then by special option for cancel (or even compelete) command, useful only for mirror.

So, what about the following substitution for block-job-cancel:

block-job-cancel(force=true)  -->  use job-cancel

block-job-cancel(force=false) for backup, stream, commit  -->  use job-cancel

block-job-cancel(force=false) for mirror in ready mode  -->

   instead, use block-job-complete. If you don't need final graph modification which mirror job normally does, use graph-change=false parameter for blockdev-mirror command.


(I can hardly remember, that we've already discussed something like this long time ago, but I don't remember the results)

I also a bit unsure about active commit soft-cancelling semantics. Is it actually useful? If yes, block-commit command will need similar option.
Fiona Ebner March 8, 2024, 8:21 a.m. UTC | #14
Am 07.03.24 um 20:42 schrieb Vladimir Sementsov-Ogievskiy:
> On 04.03.24 14:09, Peter Krempa wrote:
>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>>> Kevin Wolf<kwolf@redhat.com>  writes:
>>
>> [...]
>>
>>>>> Is the job abstraction a failure?
>>>>>
>>>>> We have
>>>>>
>>>>>       block-job- command      since   job- command    since
>>>>>       -----------------------------------------------------
>>>>>       block-job-set-speed     1.1
>>>>>       block-job-cancel        1.1     job-cancel      3.0
>>>>>       block-job-pause         1.3     job-pause       3.0
>>>>>       block-job-resume        1.3     job-resume      3.0
>>>>>       block-job-complete      1.3     job-complete    3.0
>>>>>       block-job-dismiss       2.12    job-dismiss     3.0
>>>>>       block-job-finalize      2.12    job-finalize    3.0
>>>>>       block-job-change        8.2
>>>>>       query-block-jobs        1.1     query-jobs
>>
>> [...]
>>
>>> I consider these strictly optional. We don't really have strong reasons
>>> to deprecate these commands (they are just thin wrappers), and I think
>>> libvirt still uses block-job-* in some places.
>>
>> Libvirt uses 'block-job-cancel' because it has different semantics from
>> 'job-cancel' which libvirt documented as the behaviour of the API that
>> uses it. (Semantics regarding the expectation of what is written to the
>> destination node at the point when the job is cancelled).
>>
> 
> That's the following semantics:
> 
>   # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>   # indicated (via the event BLOCK_JOB_READY) that the source and
>   # destination are synchronized, then the event triggered by this
>   # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>   # mirroring has ended and the destination now has a point-in-time copy
>   # tied to the time of the cancellation.
> 
> Hmm. Looking at this, it looks for me, that should probably a
> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
> 
> Actually, what is the difference between block-job-complete and
> block-job-cancel(force=false) for mirror in ready state?
> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job
> synchronously.. But looking at mirror code I see it just set
> s->should_complete = true, which will be then handled asynchronously..
> So I doubt that documentation is correct.
> 
> 2. block-job-complete will trigger final graph changes. block-job-cancel
> will not.
> 
> Is [2] really useful? Seems yes: in case of some failure before starting
> migration target, we'd like to continue executing source. So, no reason
> to break block-graph in source, better keep it unchanged.
> 

FWIW, we also rely on these special semantics. We allow cloning the disk
state of a running guest using drive-mirror (and before finishing,
fsfreeze in the guest for consistency). We cannot use block-job-complete
there, because we do not want to switch the source's drive.

> But I think, such behavior better be setup by mirror-job start
> parameter, rather then by special option for cancel (or even compelete)
> command, useful only for mirror.
> 
> So, what about the following substitution for block-job-cancel:
> 
> block-job-cancel(force=true)  -->  use job-cancel
> 
> block-job-cancel(force=false) for backup, stream, commit  -->  use
> job-cancel
> 
> block-job-cancel(force=false) for mirror in ready mode  -->
> 
>   instead, use block-job-complete. If you don't need final graph
> modification which mirror job normally does, use graph-change=false
> parameter for blockdev-mirror command.
> 

But yes, having a graph-change parameter would work for us too :)

Best Regards,
Fiona
Kevin Wolf March 8, 2024, 8:52 a.m. UTC | #15
Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 04.03.24 14:09, Peter Krempa wrote:
> > On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> > > Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > On 03.11.23 18:56, Markus Armbruster wrote:
> > > > > Kevin Wolf<kwolf@redhat.com>  writes:
> > 
> > [...]
> > 
> > > > > Is the job abstraction a failure?
> > > > > 
> > > > > We have
> > > > > 
> > > > >       block-job- command      since   job- command    since
> > > > >       -----------------------------------------------------
> > > > >       block-job-set-speed     1.1
> > > > >       block-job-cancel        1.1     job-cancel      3.0
> > > > >       block-job-pause         1.3     job-pause       3.0
> > > > >       block-job-resume        1.3     job-resume      3.0
> > > > >       block-job-complete      1.3     job-complete    3.0
> > > > >       block-job-dismiss       2.12    job-dismiss     3.0
> > > > >       block-job-finalize      2.12    job-finalize    3.0
> > > > >       block-job-change        8.2
> > > > >       query-block-jobs        1.1     query-jobs
> > 
> > [...]
> > 
> > > I consider these strictly optional. We don't really have strong reasons
> > > to deprecate these commands (they are just thin wrappers), and I think
> > > libvirt still uses block-job-* in some places.
> > 
> > Libvirt uses 'block-job-cancel' because it has different semantics from
> > 'job-cancel' which libvirt documented as the behaviour of the API that
> > uses it. (Semantics regarding the expectation of what is written to the
> > destination node at the point when the job is cancelled).
> > 
> 
> That's the following semantics:
> 
>   # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>   # indicated (via the event BLOCK_JOB_READY) that the source and
>   # destination are synchronized, then the event triggered by this
>   # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>   # mirroring has ended and the destination now has a point-in-time copy
>   # tied to the time of the cancellation.
> 
> Hmm. Looking at this, it looks for me, that should probably a
> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).

Yes, it's just a different completion mode.

> Actually, what is the difference between block-job-complete and
> block-job-cancel(force=false) for mirror in ready state?
> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job
>    synchronously.. But looking at mirror code I see it just set
>    s->should_complete = true, which will be then handled
>    asynchronously..  So I doubt that documentation is correct.
> 
> 2. block-job-complete will trigger final graph changes.
>    block-job-cancel will not.
> 
> Is [2] really useful? Seems yes: in case of some failure before
> starting migration target, we'd like to continue executing source. So,
> no reason to break block-graph in source, better keep it unchanged.
> 
> But I think, such behavior better be setup by mirror-job start
> parameter, rather then by special option for cancel (or even
> compelete) command, useful only for mirror.

I'm not sure, having the option on the complete command makes more sense
to me than having it in blockdev-mirror.

I do see the challenge of representing this meaningfully in QAPI,
though. Semantically it should be a union with job-specific options and
only mirror adds the graph-changes option. But the union variant
can't be directly selected from another option - instead we have a job
ID, and the variant is the job type of the job with this ID.

Practically speaking, we would probably indeed end up with an optional
field in the existing completion command.

> So, what about the following substitution for block-job-cancel:
> 
> block-job-cancel(force=true)  -->  use job-cancel
> 
> block-job-cancel(force=false) for backup, stream, commit  -->  use job-cancel
> 
> block-job-cancel(force=false) for mirror in ready mode  -->
> 
>   instead, use block-job-complete. If you don't need final graph
>   modification which mirror job normally does, use graph-change=false
>   parameter for blockdev-mirror command.

Apart from the open question where to put the option, agreed.

> (I can hardly remember, that we've already discussed something like
> this long time ago, but I don't remember the results)

I think everyone agreed that this is how things should be, and nobody
did anything to achieve it.

> I also a bit unsure about active commit soft-cancelling semantics. Is
> it actually useful? If yes, block-commit command will need similar
> option.

Hm... That would commit everything down to the lower layer and then keep
the old overlays still around?

I could see a limited use case for committing into the immediate backing
file and then keep using the overlay to accumulate new changes (would be
more useful if we discarded the old content). Once you have intermediate
backing files, I don't think it makes any sense any more.

Kevin
Peter Krempa March 10, 2024, 9:07 p.m. UTC | #16
On Thu, Mar 07, 2024 at 22:42:56 +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 04.03.24 14:09, Peter Krempa wrote:
> > On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> > > Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > On 03.11.23 18:56, Markus Armbruster wrote:
> > > > > Kevin Wolf<kwolf@redhat.com>  writes:

[...]

> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job synchronously.. But looking at mirror code I see it just set s->should_complete = true, which will be then handled asynchronously.. So I doubt that documentation is correct.
> 
> 2. block-job-complete will trigger final graph changes. block-job-cancel will not.
> 
> Is [2] really useful? Seems yes: in case of some failure before starting migration target, we'd like to continue executing source. So, no reason to break block-graph in source, better keep it unchanged.
> 
> But I think, such behavior better be setup by mirror-job start parameter, rather then by special option for cancel (or even compelete) command, useful only for mirror.

Libvirt's API design was based on the qemu quirk and thus we allow users
to do the decision after starting the job, thus anything you pick needs
to allow us to do this at "completion" time.

Libvirt can adapt to any option that will give us the above semantics
(extra parameter at completion time, different completion command or
extra command to switch job properties right before completion), but to
be honest all of these feel like they would be more hassle than keeping
'block-job-cancel' around from qemu's side.
Vladimir Sementsov-Ogievskiy March 11, 2024, 3:15 p.m. UTC | #17
On 08.03.24 11:52, Kevin Wolf wrote:
> Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 04.03.24 14:09, Peter Krempa wrote:
>>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>>>> Kevin Wolf<kwolf@redhat.com>  writes:
>>>
>>> [...]
>>>
>>>>>> Is the job abstraction a failure?
>>>>>>
>>>>>> We have
>>>>>>
>>>>>>        block-job- command      since   job- command    since
>>>>>>        -----------------------------------------------------
>>>>>>        block-job-set-speed     1.1
>>>>>>        block-job-cancel        1.1     job-cancel      3.0
>>>>>>        block-job-pause         1.3     job-pause       3.0
>>>>>>        block-job-resume        1.3     job-resume      3.0
>>>>>>        block-job-complete      1.3     job-complete    3.0
>>>>>>        block-job-dismiss       2.12    job-dismiss     3.0
>>>>>>        block-job-finalize      2.12    job-finalize    3.0
>>>>>>        block-job-change        8.2
>>>>>>        query-block-jobs        1.1     query-jobs
>>>
>>> [...]
>>>
>>>> I consider these strictly optional. We don't really have strong reasons
>>>> to deprecate these commands (they are just thin wrappers), and I think
>>>> libvirt still uses block-job-* in some places.
>>>
>>> Libvirt uses 'block-job-cancel' because it has different semantics from
>>> 'job-cancel' which libvirt documented as the behaviour of the API that
>>> uses it. (Semantics regarding the expectation of what is written to the
>>> destination node at the point when the job is cancelled).
>>>
>>
>> That's the following semantics:
>>
>>    # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>>    # indicated (via the event BLOCK_JOB_READY) that the source and
>>    # destination are synchronized, then the event triggered by this
>>    # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>>    # mirroring has ended and the destination now has a point-in-time copy
>>    # tied to the time of the cancellation.
>>
>> Hmm. Looking at this, it looks for me, that should probably a
>> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
> 
> Yes, it's just a different completion mode.
> 
>> Actually, what is the difference between block-job-complete and
>> block-job-cancel(force=false) for mirror in ready state?
>>
>> I only see the following differencies:
>>
>> 1. block-job-complete documents that it completes the job
>>     synchronously.. But looking at mirror code I see it just set
>>     s->should_complete = true, which will be then handled
>>     asynchronously..  So I doubt that documentation is correct.
>>
>> 2. block-job-complete will trigger final graph changes.
>>     block-job-cancel will not.
>>
>> Is [2] really useful? Seems yes: in case of some failure before
>> starting migration target, we'd like to continue executing source. So,
>> no reason to break block-graph in source, better keep it unchanged.
>>
>> But I think, such behavior better be setup by mirror-job start
>> parameter, rather then by special option for cancel (or even
>> compelete) command, useful only for mirror.
> 
> I'm not sure, having the option on the complete command makes more sense
> to me than having it in blockdev-mirror.
> 
> I do see the challenge of representing this meaningfully in QAPI,
> though. Semantically it should be a union with job-specific options and
> only mirror adds the graph-changes option. But the union variant
> can't be directly selected from another option - instead we have a job
> ID, and the variant is the job type of the job with this ID.

We already have such command: block-job-change. Which has id and type parameters, so user have to pass both, to identify the job itself and pick corresponding variant of the union type.

That would be good to somehow teach QAPI to get the type automatically from the job itself...


Probably, best way is to utilize the new "change" command?

So, to avoid final graph change, user should either set graph-change=false in blockdev-mirror, or just call job-change(graph-change=false) before job-complete?


Still having the option for job-complete looks nicer. But right, we either have to add type parameter like in block-job-change, or add a common option, which would be irrelevant to some jobs.

> 
> Practically speaking, we would probably indeed end up with an optional
> field in the existing completion command.
> 
>> So, what about the following substitution for block-job-cancel:
>>
>> block-job-cancel(force=true)  -->  use job-cancel
>>
>> block-job-cancel(force=false) for backup, stream, commit  -->  use job-cancel
>>
>> block-job-cancel(force=false) for mirror in ready mode  -->
>>
>>    instead, use block-job-complete. If you don't need final graph
>>    modification which mirror job normally does, use graph-change=false
>>    parameter for blockdev-mirror command.
> 
> Apart from the open question where to put the option, agreed.
> 
>> (I can hardly remember, that we've already discussed something like
>> this long time ago, but I don't remember the results)
> 
> I think everyone agreed that this is how things should be, and nobody
> did anything to achieve it.
> 
>> I also a bit unsure about active commit soft-cancelling semantics. Is
>> it actually useful? If yes, block-commit command will need similar
>> option.
> 
> Hm... That would commit everything down to the lower layer and then keep
> the old overlays still around?
> 
> I could see a limited use case for committing into the immediate backing
> file and then keep using the overlay to accumulate new changes (would be
> more useful if we discarded the old content). Once you have intermediate
> backing files, I don't think it makes any sense any more.
> 
> Kevin
>
Vladimir Sementsov-Ogievskiy March 11, 2024, 3:51 p.m. UTC | #18
On 11.03.24 00:07, Peter Krempa wrote:
> On Thu, Mar 07, 2024 at 22:42:56 +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 04.03.24 14:09, Peter Krempa wrote:
>>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>>>> Kevin Wolf<kwolf@redhat.com>  writes:
> 
> [...]
> 
>>
>> I only see the following differencies:
>>
>> 1. block-job-complete documents that it completes the job synchronously.. But looking at mirror code I see it just set s->should_complete = true, which will be then handled asynchronously.. So I doubt that documentation is correct.
>>
>> 2. block-job-complete will trigger final graph changes. block-job-cancel will not.
>>
>> Is [2] really useful? Seems yes: in case of some failure before starting migration target, we'd like to continue executing source. So, no reason to break block-graph in source, better keep it unchanged.
>>
>> But I think, such behavior better be setup by mirror-job start parameter, rather then by special option for cancel (or even compelete) command, useful only for mirror.
> 
> Libvirt's API design was based on the qemu quirk and thus we allow users
> to do the decision after starting the job, thus anything you pick needs
> to allow us to do this at "completion" time.
> 

OK, this means that simply switch to an option at job-start is not enough.

> Libvirt can adapt to any option that will give us the above semantics
> (extra parameter at completion time, different completion command or
> extra command to switch job properties right before completion), but to
> be honest all of these feel like they would be more hassle than keeping
> 'block-job-cancel' around from qemu's side.
> 

I understand. But still, it would be good to finally resolve the duplication between job-* and block-job-* APIs. We can keep old quirk working for a long time even after making a new consistent API.
Peter Krempa March 11, 2024, 4:07 p.m. UTC | #19
On Mon, Mar 11, 2024 at 18:51:18 +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 11.03.24 00:07, Peter Krempa wrote:
> > On Thu, Mar 07, 2024 at 22:42:56 +0300, Vladimir Sementsov-Ogievskiy wrote:

[...]

> > Libvirt can adapt to any option that will give us the above semantics
> > (extra parameter at completion time, different completion command or
> > extra command to switch job properties right before completion), but to
> > be honest all of these feel like they would be more hassle than keeping
> > 'block-job-cancel' around from qemu's side.
> > 
> 
> I understand. But still, it would be good to finally resolve the duplication between job-* and block-job-* APIs. We can keep old quirk working for a long time even after making a new consistent API.

Sure, if you decide to go that way it's okay as long as we can avoid the
graph change at 'completion' time. However you decide to implement it
just let me know in advance so that I can prepare the libvirt patches
for it.
Vladimir Sementsov-Ogievskiy March 12, 2024, 1:44 p.m. UTC | #20
On 11.03.24 18:15, Vladimir Sementsov-Ogievskiy wrote:
> On 08.03.24 11:52, Kevin Wolf wrote:
>> Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> On 04.03.24 14:09, Peter Krempa wrote:
>>>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>>>>> Kevin Wolf<kwolf@redhat.com>  writes:
>>>>
>>>> [...]
>>>>
>>>>>>> Is the job abstraction a failure?
>>>>>>>
>>>>>>> We have
>>>>>>>
>>>>>>>        block-job- command      since   job- command    since
>>>>>>>        -----------------------------------------------------
>>>>>>>        block-job-set-speed     1.1
>>>>>>>        block-job-cancel        1.1     job-cancel      3.0
>>>>>>>        block-job-pause         1.3     job-pause       3.0
>>>>>>>        block-job-resume        1.3     job-resume      3.0
>>>>>>>        block-job-complete      1.3     job-complete    3.0
>>>>>>>        block-job-dismiss       2.12    job-dismiss     3.0
>>>>>>>        block-job-finalize      2.12    job-finalize    3.0
>>>>>>>        block-job-change        8.2
>>>>>>>        query-block-jobs        1.1     query-jobs
>>>>
>>>> [...]
>>>>
>>>>> I consider these strictly optional. We don't really have strong reasons
>>>>> to deprecate these commands (they are just thin wrappers), and I think
>>>>> libvirt still uses block-job-* in some places.
>>>>
>>>> Libvirt uses 'block-job-cancel' because it has different semantics from
>>>> 'job-cancel' which libvirt documented as the behaviour of the API that
>>>> uses it. (Semantics regarding the expectation of what is written to the
>>>> destination node at the point when the job is cancelled).
>>>>
>>>
>>> That's the following semantics:
>>>
>>>    # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>>>    # indicated (via the event BLOCK_JOB_READY) that the source and
>>>    # destination are synchronized, then the event triggered by this
>>>    # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>>>    # mirroring has ended and the destination now has a point-in-time copy
>>>    # tied to the time of the cancellation.
>>>
>>> Hmm. Looking at this, it looks for me, that should probably a
>>> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
>>
>> Yes, it's just a different completion mode.
>>
>>> Actually, what is the difference between block-job-complete and
>>> block-job-cancel(force=false) for mirror in ready state?
>>>
>>> I only see the following differencies:
>>>
>>> 1. block-job-complete documents that it completes the job
>>>     synchronously.. But looking at mirror code I see it just set
>>>     s->should_complete = true, which will be then handled
>>>     asynchronously..  So I doubt that documentation is correct.
>>>
>>> 2. block-job-complete will trigger final graph changes.
>>>     block-job-cancel will not.
>>>
>>> Is [2] really useful? Seems yes: in case of some failure before
>>> starting migration target, we'd like to continue executing source. So,
>>> no reason to break block-graph in source, better keep it unchanged.
>>>
>>> But I think, such behavior better be setup by mirror-job start
>>> parameter, rather then by special option for cancel (or even
>>> compelete) command, useful only for mirror.
>>
>> I'm not sure, having the option on the complete command makes more sense
>> to me than having it in blockdev-mirror.
>>
>> I do see the challenge of representing this meaningfully in QAPI,
>> though. Semantically it should be a union with job-specific options and
>> only mirror adds the graph-changes option. But the union variant
>> can't be directly selected from another option - instead we have a job
>> ID, and the variant is the job type of the job with this ID.
> 
> We already have such command: block-job-change. Which has id and type parameters, so user have to pass both, to identify the job itself and pick corresponding variant of the union type.
> 
> That would be good to somehow teach QAPI to get the type automatically from the job itself...


Seems, that's easy enough to implement such a possibility, I'll try. At least now I have a prototype, which compiles

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0ae8ae62dc..332de67e52 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3116,13 +3116,11 @@
  #
  # @id: The job identifier
  #
-# @type: The job type
-#
  # Since: 8.2
  ##
  { 'union': 'BlockJobChangeOptions',
-  'base': { 'id': 'str', 'type': 'JobType' },
-  'discriminator': 'type',
+  'base': { 'id': 'str' },
+  'discriminator': 'JobType',
    'data': { 'mirror': 'BlockJobChangeOptionsMirror' } }


to


bool visit_type_BlockJobChangeOptions_members(Visitor *v, BlockJobChangeOptions *obj, Error **errp)
{
     JobType tag;
     if (!visit_type_q_obj_BlockJobChangeOptions_base_members(v, (q_obj_BlockJobChangeOptions_base *)obj, errp)) {
         return false;
     }
     if (!BlockJobChangeOptions_mapper(obj, &tag, errp)) {
         return false;
     }
     switch (tag) {
     case JOB_TYPE_MIRROR:
         return visit_type_BlockJobChangeOptionsMirror_members(v, &obj->u.mirror, errp);
     case JOB_TYPE_COMMIT:
         break;
     ...



(specifying member as descriminator works too, and I'm not going to change the interface of block-job-change, it's just and example)

BlockJobChangeOptions_mapper() should be defined by hand, like this:

bool BlockJobChangeOptions_mapper(BlockJobChangeOptions *opts, JobType *out,
                                   Error **errp)
{
     BlockJob *job;

     JOB_LOCK_GUARD();

     job = find_block_job_locked(opts->id, errp);
     if (!job) {
         return false;
     }

     return job_type(&job->job);
}
Kevin Wolf March 12, 2024, 3:49 p.m. UTC | #21
Am 12.03.2024 um 14:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 11.03.24 18:15, Vladimir Sementsov-Ogievskiy wrote:
> > On 08.03.24 11:52, Kevin Wolf wrote:
> > > Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > On 04.03.24 14:09, Peter Krempa wrote:
> > > > > On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> > > > > > Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > > > On 03.11.23 18:56, Markus Armbruster wrote:
> > > > > > > > Kevin Wolf<kwolf@redhat.com>  writes:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > > > Is the job abstraction a failure?
> > > > > > > > 
> > > > > > > > We have
> > > > > > > > 
> > > > > > > >        block-job- command      since   job- command    since
> > > > > > > >        -----------------------------------------------------
> > > > > > > >        block-job-set-speed     1.1
> > > > > > > >        block-job-cancel        1.1     job-cancel      3.0
> > > > > > > >        block-job-pause         1.3     job-pause       3.0
> > > > > > > >        block-job-resume        1.3     job-resume      3.0
> > > > > > > >        block-job-complete      1.3     job-complete    3.0
> > > > > > > >        block-job-dismiss       2.12    job-dismiss     3.0
> > > > > > > >        block-job-finalize      2.12    job-finalize    3.0
> > > > > > > >        block-job-change        8.2
> > > > > > > >        query-block-jobs        1.1     query-jobs
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > I consider these strictly optional. We don't really have strong reasons
> > > > > > to deprecate these commands (they are just thin wrappers), and I think
> > > > > > libvirt still uses block-job-* in some places.
> > > > > 
> > > > > Libvirt uses 'block-job-cancel' because it has different semantics from
> > > > > 'job-cancel' which libvirt documented as the behaviour of the API that
> > > > > uses it. (Semantics regarding the expectation of what is written to the
> > > > > destination node at the point when the job is cancelled).
> > > > > 
> > > > 
> > > > That's the following semantics:
> > > > 
> > > >    # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
> > > >    # indicated (via the event BLOCK_JOB_READY) that the source and
> > > >    # destination are synchronized, then the event triggered by this
> > > >    # command changes to BLOCK_JOB_COMPLETED, to indicate that the
> > > >    # mirroring has ended and the destination now has a point-in-time copy
> > > >    # tied to the time of the cancellation.
> > > > 
> > > > Hmm. Looking at this, it looks for me, that should probably a
> > > > 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
> > > 
> > > Yes, it's just a different completion mode.
> > > 
> > > > Actually, what is the difference between block-job-complete and
> > > > block-job-cancel(force=false) for mirror in ready state?
> > > > 
> > > > I only see the following differencies:
> > > > 
> > > > 1. block-job-complete documents that it completes the job
> > > >     synchronously.. But looking at mirror code I see it just set
> > > >     s->should_complete = true, which will be then handled
> > > >     asynchronously..  So I doubt that documentation is correct.
> > > > 
> > > > 2. block-job-complete will trigger final graph changes.
> > > >     block-job-cancel will not.
> > > > 
> > > > Is [2] really useful? Seems yes: in case of some failure before
> > > > starting migration target, we'd like to continue executing source. So,
> > > > no reason to break block-graph in source, better keep it unchanged.
> > > > 
> > > > But I think, such behavior better be setup by mirror-job start
> > > > parameter, rather then by special option for cancel (or even
> > > > compelete) command, useful only for mirror.
> > > 
> > > I'm not sure, having the option on the complete command makes more sense
> > > to me than having it in blockdev-mirror.
> > > 
> > > I do see the challenge of representing this meaningfully in QAPI,
> > > though. Semantically it should be a union with job-specific options and
> > > only mirror adds the graph-changes option. But the union variant
> > > can't be directly selected from another option - instead we have a job
> > > ID, and the variant is the job type of the job with this ID.
> > 
> > We already have such command: block-job-change. Which has id and type parameters, so user have to pass both, to identify the job itself and pick corresponding variant of the union type.
> > 
> > That would be good to somehow teach QAPI to get the type automatically from the job itself...
> 
> 
> Seems, that's easy enough to implement such a possibility, I'll try. At least now I have a prototype, which compiles
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 0ae8ae62dc..332de67e52 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3116,13 +3116,11 @@
>  #
>  # @id: The job identifier
>  #
> -# @type: The job type
> -#
>  # Since: 8.2
>  ##
>  { 'union': 'BlockJobChangeOptions',
> -  'base': { 'id': 'str', 'type': 'JobType' },
> -  'discriminator': 'type',
> +  'base': { 'id': 'str' },
> +  'discriminator': 'JobType',
>    'data': { 'mirror': 'BlockJobChangeOptionsMirror' } }

We probably need some different syntax because I think in theory we
could get conflicts between option names and type names. But I'll leave
this discussion to Markus. I hope you can figure out something that he
is willing to accept, getting the variant from a callback looks like
useful functionality.

For output, your implementation is probably not optimal because you're
going to look up things the calling code already knows, but it's
probably not the end of the world.

Kevin
Vladimir Sementsov-Ogievskiy March 12, 2024, 6:52 p.m. UTC | #22
On 12.03.24 18:49, Kevin Wolf wrote:
> Am 12.03.2024 um 14:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 11.03.24 18:15, Vladimir Sementsov-Ogievskiy wrote:
>>> On 08.03.24 11:52, Kevin Wolf wrote:
>>>> Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>> On 04.03.24 14:09, Peter Krempa wrote:
>>>>>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>>>>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>>>> On 03.11.23 18:56, Markus Armbruster wrote:
>>>>>>>>> Kevin Wolf<kwolf@redhat.com>  writes:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>> Is the job abstraction a failure?
>>>>>>>>>
>>>>>>>>> We have
>>>>>>>>>
>>>>>>>>>         block-job- command      since   job- command    since
>>>>>>>>>         -----------------------------------------------------
>>>>>>>>>         block-job-set-speed     1.1
>>>>>>>>>         block-job-cancel        1.1     job-cancel      3.0
>>>>>>>>>         block-job-pause         1.3     job-pause       3.0
>>>>>>>>>         block-job-resume        1.3     job-resume      3.0
>>>>>>>>>         block-job-complete      1.3     job-complete    3.0
>>>>>>>>>         block-job-dismiss       2.12    job-dismiss     3.0
>>>>>>>>>         block-job-finalize      2.12    job-finalize    3.0
>>>>>>>>>         block-job-change        8.2
>>>>>>>>>         query-block-jobs        1.1     query-jobs
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>> I consider these strictly optional. We don't really have strong reasons
>>>>>>> to deprecate these commands (they are just thin wrappers), and I think
>>>>>>> libvirt still uses block-job-* in some places.
>>>>>>
>>>>>> Libvirt uses 'block-job-cancel' because it has different semantics from
>>>>>> 'job-cancel' which libvirt documented as the behaviour of the API that
>>>>>> uses it. (Semantics regarding the expectation of what is written to the
>>>>>> destination node at the point when the job is cancelled).
>>>>>>
>>>>>
>>>>> That's the following semantics:
>>>>>
>>>>>     # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>>>>>     # indicated (via the event BLOCK_JOB_READY) that the source and
>>>>>     # destination are synchronized, then the event triggered by this
>>>>>     # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>>>>>     # mirroring has ended and the destination now has a point-in-time copy
>>>>>     # tied to the time of the cancellation.
>>>>>
>>>>> Hmm. Looking at this, it looks for me, that should probably a
>>>>> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
>>>>
>>>> Yes, it's just a different completion mode.
>>>>
>>>>> Actually, what is the difference between block-job-complete and
>>>>> block-job-cancel(force=false) for mirror in ready state?
>>>>>
>>>>> I only see the following differencies:
>>>>>
>>>>> 1. block-job-complete documents that it completes the job
>>>>>      synchronously.. But looking at mirror code I see it just set
>>>>>      s->should_complete = true, which will be then handled
>>>>>      asynchronously..  So I doubt that documentation is correct.
>>>>>
>>>>> 2. block-job-complete will trigger final graph changes.
>>>>>      block-job-cancel will not.
>>>>>
>>>>> Is [2] really useful? Seems yes: in case of some failure before
>>>>> starting migration target, we'd like to continue executing source. So,
>>>>> no reason to break block-graph in source, better keep it unchanged.
>>>>>
>>>>> But I think, such behavior better be setup by mirror-job start
>>>>> parameter, rather then by special option for cancel (or even
>>>>> compelete) command, useful only for mirror.
>>>>
>>>> I'm not sure, having the option on the complete command makes more sense
>>>> to me than having it in blockdev-mirror.
>>>>
>>>> I do see the challenge of representing this meaningfully in QAPI,
>>>> though. Semantically it should be a union with job-specific options and
>>>> only mirror adds the graph-changes option. But the union variant
>>>> can't be directly selected from another option - instead we have a job
>>>> ID, and the variant is the job type of the job with this ID.
>>>
>>> We already have such command: block-job-change. Which has id and type parameters, so user have to pass both, to identify the job itself and pick corresponding variant of the union type.
>>>
>>> That would be good to somehow teach QAPI to get the type automatically from the job itself...
>>
>>
>> Seems, that's easy enough to implement such a possibility, I'll try. At least now I have a prototype, which compiles
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 0ae8ae62dc..332de67e52 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -3116,13 +3116,11 @@
>>   #
>>   # @id: The job identifier
>>   #
>> -# @type: The job type
>> -#
>>   # Since: 8.2
>>   ##
>>   { 'union': 'BlockJobChangeOptions',
>> -  'base': { 'id': 'str', 'type': 'JobType' },
>> -  'discriminator': 'type',
>> +  'base': { 'id': 'str' },
>> +  'discriminator': 'JobType',
>>     'data': { 'mirror': 'BlockJobChangeOptionsMirror' } }
> 
> We probably need some different syntax because I think in theory we
> could get conflicts between option names and type names. 

I think we shouldn't, as enum types in QAPI are CamelCase and members are lowercase. Still, it's probably a bad way to rely on text case (I just imagine documenting this:)) It could be "'discriminator-type': 'JobType'" instead.

> But I'll leave
> this discussion to Markus. I hope you can figure out something that he
> is willing to accept, getting the variant from a callback looks like
> useful functionality.
> 
> For output, your implementation is probably not optimal because you're
> going to look up things the calling code already knows, but it's
> probably not the end of the world.
> 

Yes I thought about it, we'll call find_block_job_locked() twice. That's probably solvable by including some opaque pointer to QAPI structure, which could be set during json parsing and then reused by qmp_* command itself.. But I don't think it worth doing for simple search for a job. Simpler approach would be to cache globally last result of find_block_job().. But again that would be extra optimization