[29/42] qapi: Add "Details:" disambiguation marker

Message ID: 20250205231208.1480762-30-jsnow@redhat.com
State: New
Series: docs: add sphinx-domain rST generator to qapidoc

Commit Message

John Snow Feb. 5, 2025, 11:11 p.m. UTC
This clarifies sections that are mistaken by the parser as "intro"
sections to be "details" sections instead.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 qapi/machine.json      | 2 ++
 qapi/migration.json    | 4 ++++
 qapi/qom.json          | 4 ++++
 qapi/yank.json         | 2 ++
 scripts/qapi/parser.py | 8 ++++++++
 5 files changed, 20 insertions(+)

Comments

Markus Armbruster Feb. 12, 2025, 9:37 a.m. UTC | #1
John Snow <jsnow@redhat.com> writes:

> This clarifies sections that are mistaken by the parser as "intro"
> sections to be "details" sections instead.

Impact on output?  See notes inline.

>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  qapi/machine.json      | 2 ++
>  qapi/migration.json    | 4 ++++
>  qapi/qom.json          | 4 ++++
>  qapi/yank.json         | 2 ++
>  scripts/qapi/parser.py | 8 ++++++++
>  5 files changed, 20 insertions(+)
>
> diff --git a/qapi/machine.json b/qapi/machine.json
> index a6b8795b09e..3c1b397f6cc 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1301,6 +1301,8 @@
>  # Return the amount of initially allocated and present hotpluggable
>  # (if enabled) memory in bytes.
>  #
> +# Details:
> +#
>  # .. qmp-example::
>  #
>  #     -> { "execute": "query-memory-size-summary" }

Output unchanged in my testing.  Same for the other hunks unless
otherwise noted.

> diff --git a/qapi/migration.json b/qapi/migration.json
> index 43babd1df41..9070a91e655 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1920,6 +1920,8 @@
>  #
>  # Xen uses this command to notify replication to trigger a checkpoint.
>  #
> +# Details:
> +#
>  # .. qmp-example::
>  #
>  #     -> { "execute": "xen-colo-do-checkpoint" }
> @@ -1993,6 +1995,8 @@
>  #
>  # Pause a migration.  Currently it only supports postcopy.
>  #
> +# Details:
> +#
>  # .. qmp-example::
>  #
>  #     -> { "execute": "migrate-pause" }
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 11277d1f84c..5d285ef9239 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -729,6 +729,8 @@
>  #
>  # Properties for memory-backend-shm objects.
>  #
> +# Details:
> +#
>  # This memory backend supports only shared memory, which is the
>  # default.
>  #

The paragraph moves from above to below the auto-generated member
documentation, like this:

    @@ -25908,13 +25908,13 @@ If

     Properties for memory-backend-shm objects.

    -This memory backend supports only shared memory, which is the default.
    -

     Members
     ~~~~~~~

     The members of "MemoryBackendProperties"
    +This memory backend supports only shared memory, which is the default.
    +

     Since
     ~~~~~

This is sphinx-build -b text.  I don't understand why there is no blank
line between "The members of ... " and the moved paragraph.

> @@ -744,6 +746,8 @@
>  #
>  # Properties for memory-backend-epc objects.
>  #
> +# Details:
> +#
>  # The @merge boolean option is false by default with epc
>  #
>  # The @dump boolean option is false by default with epc

Likewise.

> diff --git a/qapi/yank.json b/qapi/yank.json
> index 30f46c97c98..4d36d21e76a 100644
> --- a/qapi/yank.json
> +++ b/qapi/yank.json
> @@ -104,6 +104,8 @@
>  #
>  # Returns: list of @YankInstance
>  #
> +# Details:
> +#
>  # .. qmp-example::
>  #
>  #     -> { "execute": "query-yank" }
> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index c5d2b950a82..5890a13b5ba 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -544,6 +544,14 @@ def _tag_check(what: str) -> None:
>                          raise QAPIParseError(
>                              self, 'feature descriptions expected')
>                      have_tagged = True
> +                elif line == 'Details:':
> +                    _tag_check("Details")
> +                    self.accept(False)
> +                    line = self.get_doc_line()
> +                    while line == '':
> +                        self.accept(False)
> +                        line = self.get_doc_line()
> +                    have_tagged = True
>                  elif match := self._match_at_name_colon(line):
>                      # description
>                      if have_tagged:
Markus Armbruster Feb. 17, 2025, 10:51 a.m. UTC | #2
John Snow <jsnow@redhat.com> writes:

> This clarifies sections that are mistaken by the parser as "intro"
> sections to be "details" sections instead.
>
> Signed-off-by: John Snow <jsnow@redhat.com>

Is this missing announce-self in net.json?

diff --git a/qapi/net.json b/qapi/net.json
index 49bc7de64e..44ed72dbe9 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -948,7 +948,7 @@
 # switches.  This can be useful when network bonds fail-over the
 # active slave.
 #
-# TODO: This line is a hack to separate the example from the body
+# Details:
 #
 # .. qmp-example::
 #
Markus Armbruster Feb. 17, 2025, 11:55 a.m. UTC | #3
John Snow <jsnow@redhat.com> writes:

> This clarifies sections that are mistaken by the parser as "intro"
> sections to be "details" sections instead.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  qapi/machine.json      | 2 ++
>  qapi/migration.json    | 4 ++++
>  qapi/qom.json          | 4 ++++
>  qapi/yank.json         | 2 ++
>  scripts/qapi/parser.py | 8 ++++++++
>  5 files changed, 20 insertions(+)

Missing updates for the new syntax

* Documentation: docs/devel/qapi-code-gen.rst

* Positive test case(s): tests/qapi-schema/doc-good.json

* Maybe a negative test case for _tag_check() failure

[...]

> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index c5d2b950a82..5890a13b5ba 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -544,6 +544,14 @@ def _tag_check(what: str) -> None:
>                          raise QAPIParseError(
>                              self, 'feature descriptions expected')
>                      have_tagged = True
> +                elif line == 'Details:':
> +                    _tag_check("Details")

This one.

> +                    self.accept(False)
> +                    line = self.get_doc_line()
> +                    while line == '':
> +                        self.accept(False)
> +                        line = self.get_doc_line()
> +                    have_tagged = True
>                  elif match := self._match_at_name_colon(line):
>                      # description
>                      if have_tagged:
Markus Armbruster Feb. 17, 2025, 12:13 p.m. UTC | #4
John Snow <jsnow@redhat.com> writes:

> This clarifies sections that are mistaken by the parser as "intro"
> sections to be "details" sections instead.
>
> Signed-off-by: John Snow <jsnow@redhat.com>

This is rather terse.

Why does the boundary between "intro" (previously "body") and "details"
matter?  As far as I understand, it matters for inlining.

What is inlining?

The old doc generator emits "The members of T" into the argument
description in the following cases:

* When a command's arguments are given as a type T, the doc comment has
  no argument descriptions, and the generated argument description
  becomes "The members of T".

* When an object type has a base type T, "The members of T" is appended
  to the doc comment's (possibly empty) argument descriptions.

* For union types, "The members of T when TAG is VALUE" is appended to
  the doc comment's argument descriptions for every tag VALUE and
  associated type T.

We want a description of the members of T right there instead.  To get
it right there, we need to inline from T's documentation.

What exactly do we need to inline?  Turns out we don't want "intro", we
do want the argument descriptions and other stuff we can ignore here.
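
To illustrate the inlining described above, here is a minimal sketch in
Python.  It is not the qapidoc implementation: the TypeDoc class, the
inline_members() function, and the example types "Base" and "Derived"
are hypothetical names made up for illustration.  It assumes each type
records an intro, its own member descriptions, and an optional base
type.  Member descriptions are collected along the base chain, and every
"intro" is left behind.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TypeDoc:
    """Hypothetical stand-in for one type's parsed documentation."""
    name: str
    intro: str
    members: Dict[str, str] = field(default_factory=dict)
    base: Optional[str] = None


def inline_members(name: str, docs: Dict[str, TypeDoc]) -> List[str]:
    """Collect member descriptions of a type and all of its bases.

    Base members come first.  The "intro" text is deliberately
    skipped, since it is not wanted at the inlining site.
    """
    doc = docs[name]
    lines: List[str] = []
    if doc.base is not None:
        lines.extend(inline_members(doc.base, docs))
    for member, text in doc.members.items():
        lines.append(f"{member}: {text}")
    return lines


docs = {
    "Base": TypeDoc("Base", "Intro of Base.", {"size": "size in bytes"}),
    "Derived": TypeDoc("Derived", "Intro of Derived.",
                       {"flag": "some flag"}, base="Base"),
}
print(inline_members("Derived", docs))
```

Any text that lives only in the intro of "Base" or "Derived" never
reaches the result, which is why the placement of the intro boundary
matters.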

"intro" ends before the argument descriptions, features, or a tagged
section, whatever comes first.  Most of the time, this works fine.  But
there are a few troublesome cases.  Here's one:

    ##
    # @MemoryBackendShmProperties:
    #
    # Properties for memory-backend-shm objects.
    #
    # This memory backend supports only shared memory, which is the
    # default.
    #
    # Since: 9.1
    ##
    { 'struct': 'MemoryBackendShmProperties',
      'base': 'MemoryBackendProperties',
      'data': { },
      'if': 'CONFIG_POSIX' }

Everything up to "Since:" is "intro".  Consequently, the old doc
generator emits "The members of MemoryBackendProperties" right there:

    "MemoryBackendShmProperties" (Object)
    -------------------------------------

    Properties for memory-backend-shm objects.

    This memory backend supports only shared memory, which is the default.


    Members
    ~~~~~~~

    The members of "MemoryBackendProperties"

    Since
    ~~~~~

    9.1


    If
    ~~

    "CONFIG_POSIX"

That's also where the new one inlines.  Okay so far.

This gets in turn inlined into ObjectOptions for branch
memory-backend-shm.  Since we don't inline "intro", we don't inline
"This memory backend supports only shared memory, which is the default."
That's a problem.

This patch moves the boundary between "intro" and the remainder up above
that paragraph, so we don't lose that line.  It accomplishes that by
giving us syntax to manually mark the end of "intro".

However, your solution is manual: it gives us the means[*] to mark the
boundary with "Details:" to avoid loss of text.  What if we don't
notice?  Should we tweak the syntax to force us to be explicit?  How
many doc comments would that affect?


[*] Actually, we have means even before this patch, they're just ugly.
See the TODO comment added in commit 14b48aaab92 (qapi: convert
"Example" sections without titles).
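
The boundary rule discussed above can be sketched roughly as follows.
This is a simplified illustration, not the code in
scripts/qapi/parser.py: the function name split_intro() and the reduced
set of section tags in the regular expression are assumptions.  The
input is the doc comment body with the "# " prefixes and the line naming
the definition already removed.  "intro" ends at an explicit "Details:"
line, or else at the first argument description or tagged section.

```python
import re
from typing import List, Tuple

# Simplified assumption: only these line prefixes start a new section.
_SECTION_START = re.compile(r'^(@\S+:|Returns:|Errors:|Since:|TODO:)')


def split_intro(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Split doc body lines into ("intro", remainder)."""
    for i, line in enumerate(lines):
        if line == 'Details:':
            # Explicit marker: drop it and any blank lines after it.
            rest = lines[i + 1:]
            while rest and rest[0] == '':
                rest = rest[1:]
            return lines[:i], rest
        if _SECTION_START.match(line):
            return lines[:i], lines[i:]
    return lines, []


unmarked = [
    'Properties for memory-backend-shm objects.',
    '',
    'This memory backend supports only shared memory, which is the',
    'default.',
    '',
    'Since: 9.1',
]
# Same body with the marker placed after the first paragraph.
marked = unmarked[:2] + ['Details:', ''] + unmarked[2:]

print(split_intro(unmarked)[0])
print(split_intro(marked)[0])
```

Without the marker, the second paragraph ends up in "intro" and would be
dropped when inlining.  With the marker, it lands in the remainder.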
John Snow Feb. 18, 2025, 10:22 p.m. UTC | #5
On Wed, Feb 12, 2025 at 4:37 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > This clarifies sections that are mistaken by the parser as "intro"
> > sections to be "details" sections instead.
>
> Impact on output?  See notes inline.
>

It's very possible that there is none; in cases where the text is not
inlined, it won't make any visual difference. The occurrences in this patch
were identified with a warning from the generator that I didn't actually
submit as part of this patch series.

I was obeying an unseen master.


>
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  qapi/machine.json      | 2 ++
> >  qapi/migration.json    | 4 ++++
> >  qapi/qom.json          | 4 ++++
> >  qapi/yank.json         | 2 ++
> >  scripts/qapi/parser.py | 8 ++++++++
> >  5 files changed, 20 insertions(+)
> >
> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index a6b8795b09e..3c1b397f6cc 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1301,6 +1301,8 @@
> >  # Return the amount of initially allocated and present hotpluggable
> >  # (if enabled) memory in bytes.
> >  #
> > +# Details:
> > +#
> >  # .. qmp-example::
> >  #
> >  #     -> { "execute": "query-memory-size-summary" }
>
> Output unchanged in my testing.  Same for the other hunks unless
> otherwise noted.
>
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 43babd1df41..9070a91e655 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -1920,6 +1920,8 @@
> >  #
> >  # Xen uses this command to notify replication to trigger a checkpoint.
> >  #
> > +# Details:
> > +#
> >  # .. qmp-example::
> >  #
> >  #     -> { "execute": "xen-colo-do-checkpoint" }
> > @@ -1993,6 +1995,8 @@
> >  #
> >  # Pause a migration.  Currently it only supports postcopy.
> >  #
> > +# Details:
> > +#
> >  # .. qmp-example::
> >  #
> >  #     -> { "execute": "migrate-pause" }
> > diff --git a/qapi/qom.json b/qapi/qom.json
> > index 11277d1f84c..5d285ef9239 100644
> > --- a/qapi/qom.json
> > +++ b/qapi/qom.json
> > @@ -729,6 +729,8 @@
> >  #
> >  # Properties for memory-backend-shm objects.
> >  #
> > +# Details:
> > +#
> >  # This memory backend supports only shared memory, which is the
> >  # default.
> >  #
>
> The paragraph moves from above to below the auto-generated member
> documentation, like this:
>
>     @@ -25908,13 +25908,13 @@ If
>
>      Properties for memory-backend-shm objects.
>
>     -This memory backend supports only shared memory, which is the default.
>     -
>
>      Members
>      ~~~~~~~
>
>      The members of "MemoryBackendProperties"
>     +This memory backend supports only shared memory, which is the default.
>     +
>
>      Since
>      ~~~~~
>
> This is sphinx-build -b text.  I don't understand why there is no blank
> line between "The members of ... " and the moved paragraph.
>

... Me either! I'll investigate.


>
> > @@ -744,6 +746,8 @@
> >  #
> >  # Properties for memory-backend-epc objects.
> >  #
> > +# Details:
> > +#
> >  # The @merge boolean option is false by default with epc
> >  #
> >  # The @dump boolean option is false by default with epc
>
> Likewise.
>
> > diff --git a/qapi/yank.json b/qapi/yank.json
> > index 30f46c97c98..4d36d21e76a 100644
> > --- a/qapi/yank.json
> > +++ b/qapi/yank.json
> > @@ -104,6 +104,8 @@
> >  #
> >  # Returns: list of @YankInstance
> >  #
> > +# Details:
> > +#
> >  # .. qmp-example::
> >  #
> >  #     -> { "execute": "query-yank" }
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index c5d2b950a82..5890a13b5ba 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -544,6 +544,14 @@ def _tag_check(what: str) -> None:
> >                          raise QAPIParseError(
> >                              self, 'feature descriptions expected')
> >                      have_tagged = True
> > +                elif line == 'Details:':
> > +                    _tag_check("Details")
> > +                    self.accept(False)
> > +                    line = self.get_doc_line()
> > +                    while line == '':
> > +                        self.accept(False)
> > +                        line = self.get_doc_line()
> > +                    have_tagged = True
> >                  elif match := self._match_at_name_colon(line):
> >                      # description
> >                      if have_tagged:
>
>
John Snow Feb. 18, 2025, 10:23 p.m. UTC | #6
On Mon, Feb 17, 2025 at 5:51 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > This clarifies sections that are mistaken by the parser as "intro"
> > sections to be "details" sections instead.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
>
> Is this missing announce-self in net.json?
>
> diff --git a/qapi/net.json b/qapi/net.json
> index 49bc7de64e..44ed72dbe9 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -948,7 +948,7 @@
>  # switches.  This can be useful when network bonds fail-over the
>  # active slave.
>  #
> -# TODO: This line is a hack to separate the example from the body
> +# Details:
>  #
>  # .. qmp-example::
>  #
>

Yes, overlooked. The "hack" still works, so I missed it in my tests. Will
remedy, pending your other emails that I'm about to read in a second ...
John Snow Feb. 18, 2025, 10:26 p.m. UTC | #7
On Mon, Feb 17, 2025 at 6:55 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > This clarifies sections that are mistaken by the parser as "intro"
> > sections to be "details" sections instead.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  qapi/machine.json      | 2 ++
> >  qapi/migration.json    | 4 ++++
> >  qapi/qom.json          | 4 ++++
> >  qapi/yank.json         | 2 ++
> >  scripts/qapi/parser.py | 8 ++++++++
> >  5 files changed, 20 insertions(+)
>
> Missing updates for the new syntax
>
> * Documentation: docs/devel/qapi-code-gen.rst
>

> * Positive test case(s): tests/qapi-schema/doc-good.json
>
> * Maybe a negative test case for _tag_check() failure
>
>
Understood; I wasn't entirely sure if this concept would fly, so I saved
the polish and you got an RFC quality patch. Forgive me, please! If you
think this approach is fine, I will certainly do all the things you
outlined above.


> [...]
>
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index c5d2b950a82..5890a13b5ba 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -544,6 +544,14 @@ def _tag_check(what: str) -> None:
> >                          raise QAPIParseError(
> >                              self, 'feature descriptions expected')
> >                      have_tagged = True
> > +                elif line == 'Details:':
> > +                    _tag_check("Details")
>
> This one.
>

ACK


>
> > +                    self.accept(False)
> > +                    line = self.get_doc_line()
> > +                    while line == '':
> > +                        self.accept(False)
> > +                        line = self.get_doc_line()
> > +                    have_tagged = True
> >                  elif match := self._match_at_name_colon(line):
> >                      # description
> >                      if have_tagged:
>
>
John Snow Feb. 18, 2025, 10:48 p.m. UTC | #8
On Mon, Feb 17, 2025 at 7:13 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > This clarifies sections that are mistaken by the parser as "intro"
> > sections to be "details" sections instead.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
>
> This is rather terse.
>

Mea culpa. I can write more at length if we agree on the general approach.
For now, you got an RFC as this was the subject of a considerable amount of
controversy between us in the past ... so I am doing baby steps.

"Commit message needs to be hit with the unterseification beam" added to
tasklist. :)


>
> Why does the boundary between "intro" (previously "body") and "details"
> matter?  As far as I understand, it matters for inlining.
>

> What is inlining?
>

> The old doc generator emits "The members of T" into the argument
> description in the following cases:
>
> * When a command's arguments are given as a type T, the doc comment has
>   no argument descriptions, and the generated argument description
>   becomes "The members of T".
>
> * When an object type has a base type T, "The members of T" is appended
>   to the doc comment's (possibly empty) argument descriptions.
>
> * For union types, "The members of T when TAG is VALUE" is appended to
>   the doc comment's argument descriptions for every tag VALUE and
>   associated type T.
>
> We want a description of the members of T right there instead.  To get
> it right there, we need to inline from T's documentation.
>
> What exactly do we need to inline?  Turns out we don't want "intro", we
> do want the argument descriptions and other stuff we can ignore here.
>
> "intro" ends before the argument descriptions, features, or a tagged
> section, whatever comes first.  Most of the time, this works fine.  But
> there are a few troublesome cases.  Here's one:
>
>     ##
>     # @MemoryBackendShmProperties:
>     #
>     # Properties for memory-backend-shm objects.
>     #
>     # This memory backend supports only shared memory, which is the
>     # default.
>     #
>     # Since: 9.1
>     ##
>     { 'struct': 'MemoryBackendShmProperties',
>       'base': 'MemoryBackendProperties',
>       'data': { },
>       'if': 'CONFIG_POSIX' }
>
> Everything up to "Since:" is "intro".  Consequently, the old doc
> generator emits "The members of MemoryBackendProperties" right there:
>
>     "MemoryBackendShmProperties" (Object)
>     -------------------------------------
>
>     Properties for memory-backend-shm objects.
>
>     This memory backend supports only shared memory, which is the default.
>
>
>     Members
>     ~~~~~~~
>
>     The members of "MemoryBackendProperties"
>
>     Since
>     ~~~~~
>
>     9.1
>
>
>     If
>     ~~
>
>     "CONFIG_POSIX"
>
> That's also where the new one inlines.  Okay so far.
>
> This gets in turn inlined into ObjectOptions for branch
> memory-backend-shm.  Since we don't inline "intro", we don't inline
> "This memory backend supports only shared memory, which is the default."
> That's a problem.
>

Yes, this is all correct so far.


>
> This patch moves the boundary between "intro" and the remainder up above
> that paragraph, so we don't lose that line.  It accomplishes that by
> giving us syntax to manually mark the end of "intro".
>
> However, your solution is manual: it gives us the means[*] to mark the
> boundary with "Details:" to avoid loss of text.  What if we don't
> notice?  Should we tweak the syntax to force us to be explicit?  How
> many doc comments would that affect?
>

I'm leaving that question to you. The calculus I made was that there were
fewer SLOC changes to explicitly denote the "Details:" sections only in the
handful of cases where it was (potentially) relevant than to mandate its
use unconditionally. If you have an idea that is enforceable at runtime and
has fewer SLOC changes, suggest away!

Unseen in this patch is a warning I added to the /inliner/ that identified
potentially "ambiguous" delineation spots and issued a warning (error); the
exact code that did this is possibly a little hokey but it was what I used
to identify the spots addressed by this patch.

Point being: it's possible to enforce, but I enforced it in qapidoc.py in
the inliner instead of directly in the parser. We could discuss moving the
check to the parser if you'd like. The check itself is somewhat "dumb":

- If a doc block has only one *paragraph* (knowingly/intentionally not
using the term section here) of text, it's assumed to be the intro.
- If a doc block has any number of tagged sections, all text above (if any)
is assumed to be the "intro" and all text below (if any) is assumed to be
"details".

It's only in this case that it whines:

- A doc block has *multiple paragraphs* of text at the start of the block,
but has no other sections, so whether there is semantically a "details"
section is unclear to the parser and inliner.

The check as I wrote it is unintelligent in that it does not bother to
check if the doc block it is checking is ever one that *could* be inlined;
i.e. it will complain about being unable to delineate for commands -- even
though it wouldn't really matter in that case. It's a potential improvement
to the algorithm to ignore cases where that "ambiguity" is not actually
important.

But, it's possible to mechanically enforce and nudge documentation writers
to add the delineation marker where the parser is uncertain.


>
> [*] Actually, we have means even before this patch, they're just ugly.
> See the TODO comment added in commit 14b48aaab92 (qapi: convert
> "Example" sections without titles)


That's right. This is merely a formalization of that hack: I add a
"section" that is intentionally empty and serves only as a marker to the
parser to begin recording a new section.
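
For reference, the check as described in prose in this message can be
sketched like this.  The actual warning code was not submitted with the
series, so this is only a guess at its shape: the function names
count_paragraphs() and needs_details_marker() and their parameters are
hypothetical.

```python
import re


def count_paragraphs(text: str) -> int:
    """Count paragraphs separated by one or more blank lines."""
    return len([p for p in re.split(r'\n\s*\n', text.strip()) if p])


def needs_details_marker(leading_text: str, has_other_sections: bool) -> bool:
    """Return True when the intro/details boundary is ambiguous.

    leading_text is the free text at the start of the doc block.
    has_other_sections says whether any other section follows it.
    """
    if has_other_sections:
        # Text above the sections is intro, text below is details.
        return False
    # A single paragraph is assumed to be the intro; more is unclear.
    return count_paragraphs(leading_text) > 1


print(needs_details_marker('Only one paragraph.', False))
print(needs_details_marker('First paragraph.\n\nSecond paragraph.', False))
```

Like the check it imitates, this sketch does not consider whether the
doc block could ever be inlined.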
Markus Armbruster Feb. 19, 2025, 9:04 a.m. UTC | #9
John Snow <jsnow@redhat.com> writes:

> On Mon, Feb 17, 2025 at 6:55 AM Markus Armbruster <armbru@redhat.com> wrote:
>
>> John Snow <jsnow@redhat.com> writes:
>>
>> > This clarifies sections that are mistaken by the parser as "intro"
>> > sections to be "details" sections instead.
>> >
>> > Signed-off-by: John Snow <jsnow@redhat.com>
>> > ---
>> >  qapi/machine.json      | 2 ++
>> >  qapi/migration.json    | 4 ++++
>> >  qapi/qom.json          | 4 ++++
>> >  qapi/yank.json         | 2 ++
>> >  scripts/qapi/parser.py | 8 ++++++++
>> >  5 files changed, 20 insertions(+)
>>
>> Missing updates for the new syntax
>>
>> * Documentation: docs/devel/qapi-code-gen.rst
>>
>
>> * Positive test case(s): tests/qapi-schema/doc-good.json
>>
>> * Maybe a negative test case for _tag_check() failure
>>
>>
> Understood; I wasn't entirely sure if this concept would fly, so I saved
> the polish and you got an RFC quality patch. Forgive me, please! If you

As I wrote in review of PATCH 28, this is good strategy.

> think this approach is fine, I will certainly do all the things you
> outlined above.
>
>
>> [...]
>>
>> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>> > index c5d2b950a82..5890a13b5ba 100644
>> > --- a/scripts/qapi/parser.py
>> > +++ b/scripts/qapi/parser.py
>> > @@ -544,6 +544,14 @@ def _tag_check(what: str) -> None:
>> >                          raise QAPIParseError(
>> >                              self, 'feature descriptions expected')
>> >                      have_tagged = True
>> > +                elif line == 'Details:':
>> > +                    _tag_check("Details")
>>
>> This one.
>>
>
> ACK
>
>
>>
>> > +                    self.accept(False)
>> > +                    line = self.get_doc_line()
>> > +                    while line == '':
>> > +                        self.accept(False)
>> > +                        line = self.get_doc_line()
>> > +                    have_tagged = True
>> >                  elif match := self._match_at_name_colon(line):
>> >                      # description
>> >                      if have_tagged:
>>
>>
Markus Armbruster Feb. 19, 2025, 12:49 p.m. UTC | #10
John Snow <jsnow@redhat.com> writes:

> On Mon, Feb 17, 2025 at 7:13 AM Markus Armbruster <armbru@redhat.com> wrote:
>
>> John Snow <jsnow@redhat.com> writes:
>>
>> > This clarifies sections that are mistaken by the parser as "intro"
>> > sections to be "details" sections instead.
>> >
>> > Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> This is rather terse.
>>
>
> Mea culpa. I can write more at length if we agree on the general approach.
> For now, you got an RFC as this was the subject of a considerable amount of
> controversy between us in the past ... so I am doing baby steps.
>
> "Commit message needs to be hit with the unterseification beam" added to
> tasklist. :)
>
>
>>
>> Why does the boundary between "intro" (previously "body") and "details"
>> matter?  As far as I understand, it matters for inlining.
>>
>
>> What is inlining?
>>
>
>> The old doc generator emits "The members of T" into the argument
>> description in the following cases:
>>
>> * When a command's arguments are given as a type T, the doc comment has
>>   no argument descriptions, and the generated argument description
>>   becomes "The members of T".
>>
>> * When an object type has a base type T, "The members of T" is appended
>>   to the doc comment's (possibly empty) argument descriptions.
>>
>> * For union types, "The members of T when TAG is VALUE" is appended to
>>   the doc comment's argument descriptions for every tag VALUE and
>>   associated type T.
>>
>> We want a description of the members of T right there instead.  To get
>> it right there, we need to inline from T's documentation.
>>
>> What exactly do we need to inline?  Turns out we don't want "intro", we
>> do want the argument descriptions and other stuff we can ignore here.
>>
>> "intro" ends before the argument descriptions, features, or a tagged
>> section, whatever comes first.  Most of the time, this works fine.  But
>> there are a few troublesome cases.  Here's one:
>>
>>     ##
>>     # @MemoryBackendShmProperties:
>>     #
>>     # Properties for memory-backend-shm objects.
>>     #
>>     # This memory backend supports only shared memory, which is the
>>     # default.
>>     #
>>     # Since: 9.1
>>     ##
>>     { 'struct': 'MemoryBackendShmProperties',
>>       'base': 'MemoryBackendProperties',
>>       'data': { },
>>       'if': 'CONFIG_POSIX' }
>>
>> Everything up to "Since:" is "intro".  Consequently, the old doc
>> generator emits "The members of MemoryBackendProperties" right there:
>>
>>     "MemoryBackendShmProperties" (Object)
>>     -------------------------------------
>>
>>     Properties for memory-backend-shm objects.
>>
>>     This memory backend supports only shared memory, which is the default.
>>
>>
>>     Members
>>     ~~~~~~~
>>
>>     The members of "MemoryBackendProperties"
>>
>>     Since
>>     ~~~~~
>>
>>     9.1
>>
>>
>>     If
>>     ~~
>>
>>     "CONFIG_POSIX"
>>
>> That's also where the new one inlines.  Okay so far.
>>
>> This gets in turn inlined into ObjectOptions for branch
>> memory-backend-shm.  Since we don't inline "intro", we don't inline
>> "This memory backend supports only shared memory, which is the default."
>> That's a problem.
>>
>
> Yes, this is all correct so far.
>
>
>>
>> This patch moves the boundary between "intro" and the remainder up above
>> that paragraph, so we don't lose that line.  It accomplishes that by
>> giving us syntax to manually mark the end of "intro".
>>
>> However, your solution is manual: it gives us the means[*] to mark the
>> boundary with "Details:" to avoid loss of text.  What if we don't
>> notice?  Should we tweak the syntax to force us to be explicit?  How
>> many doc comments would that affect?
>>
>
> I'm leaving that question to you. The calculus I made was that there were
> fewer SLOC changes to explicitly denote the "Details:" sections only in the
> handful of cases where it was (potentially) relevant than to mandate its
> use unconditionally.

How did you determine where it is (potentially) relevant?  Oh, wait ...

>                      If you have an idea that is enforceable at runtime and
> has fewer SLOC changes, suggest away!
>
> Unseen in this patch is a warning I added to the /inliner/ that identified
> potentially "ambiguous" delineation spots and issued a warning (error); the
> exact code that did this is possibly a little hokey but it was what I used
> to identify the spots addressed by this patch.

... that's how.

> Point being: it's possible to enforce, but I enforced it in qapidoc.py in
> the inliner instead of directly in the parser. We could discuss moving the
> check to the parser if you'd like. The check itself is somewhat "dumb":
>
> - If a doc block has only one *paragraph* (knowingly/intentionally not
> using the term section here) of text, it's assumed to be the intro.

You mean if the "body" has just one paragraph, right?  The "body" is the
first section, always untagged, possibly empty.  It contains the text
between the line naming the definition and the first tagged section.

The tagged sections are member / argument descriptions, feature
descriptions, 'Returns', 'Errors', 'Since', and 'TODO'.

> - If a doc block has any number of tagged sections, all text above (if any)
> is assumed to be the "intro" and all text below (if any) is assumed to be
> "details".

Uh, this can't be quite right.

Consider:

    ##
    # @query-memory-size-summary:
    #
    # Return the amount of initially allocated and present hotpluggable
    # (if enabled) memory in bytes.
    #
    # .. qmp-example::
    #
    #     -> { "execute": "query-memory-size-summary" }
    #     <- { "return": { "base-memory": 4294967296, "plugged-memory": 0 } }
    #
--> # Since: 2.11
    ##

There is a tagged section.  According to your explanation, the text
above, i.e. everything between @query-memory-size-summary: and Since: is
assumed to be "intro".

According to your patch, which adds "Details:" in the middle, we do not
assume this.  Contradiction.

> It's only in this case that it whines:
>
> - A doc block has *multiple paragraphs* of text at the start of the block,
> but has no other sections and so if there is semantically a "details"
> section or not is unclear to the parser and inliner.

Let's take a step back.  docs/devel/qapi-code-gen.rst:

    Definition documentation starts with a line naming the definition,
    followed by an optional overview, a description of each argument (for
    commands and events), member (for structs and unions), branch (for
    alternates), or value (for enums), a description of each feature (if
    any), and finally optional tagged sections.

Bug: should be "finally optional tagged or untagged sections".

Your generator wants all but 'Since' and 'TODO' together, so it can
render them in a single two-column table.

This description table separates "intro" (above) and "details" (below).
Fair?

Fine and dandy separation unless the description table is *empty*.

Then the "body" (first section, always untagged) extends to the first
'Since', 'TODO', or the end of the doc comment.

Heuristic: when this first untagged section is a single paragraph, we
quietly assume it's "intro".  If it's more than one, we ask the
programmer to mark the end of "intro" explicitly.

Let's see how this works out in practice.  I stick

        if self.symbol and not (self.args or self.features or self.returns or self.errors):
            if self.body.text.find('\n\n') == -1:
                print(f"{self.info}: single para")
            else:
                print(f"{self.info}: ambiguous")

into QAPIDoc.check().  The outer conditional is true for definition
documentation (doc.symbol) where the table is empty (not ...).  The
inner conditional is a crude check for paragraphs.

This reports 47 "single para" and 8 "ambiguous" in the main QAPI schema
in master.
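
For readers who want to try the heuristic outside the QAPI generator, here
is a minimal stand-alone sketch.  The function name, its parameters and the
"delimited" result are hypothetical; only the '\n\n' test is taken from the
snippet above.

```python
def classify_body(body_text: str, table_empty: bool) -> str:
    """Classify the first untagged section of a doc comment.

    Hypothetical helper, not part of scripts/qapi.  'table_empty' stands
    in for "no arguments, features, Returns or Errors".
    """
    if not table_empty:
        # The description table itself separates "intro" from "details".
        return "delimited"
    # Crude paragraph check, as above: a blank line separates paragraphs.
    if "\n\n" not in body_text:
        return "single para"
    return "ambiguous"


if __name__ == "__main__":
    print(classify_body("Cancel the current executing migration process.\n", True))
    print(classify_body("Intro paragraph.\n\n.. note:: Something.\n", True))
```

Like the original, this treats any blank line as a paragraph break, so a
blank line inside a directive would also be reported as "ambiguous".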

Your patch hits 5 of 8 ambiguous ones, and throws in a 6th that doesn't
seem to need it:

    ##
    # @query-yank:
    #
    # Query yank instances.  See @YankInstance for more information.
    #
    # Returns: list of @YankInstance
    #
    # .. qmp-example::
    #
    #     -> { "execute": "query-yank" }
    #     <- { "return": [
    #              { "type": "block-node",
    #                "node-name": "nbd0" }
    #          ] }
    #
    # Since: 6.0
    ##

It misses in run-state.json:

    ##
    # @SUSPEND_DISK:
    #
    # Emitted when guest enters a hardware suspension state with data
    # saved on disk, for example, S4 state, which is sometimes called
    # hibernate state
    #
    # .. note:: QEMU shuts down (similar to event @SHUTDOWN) when entering
    #    this state.
    #
    # Since: 1.2
    #
    # .. qmp-example::
    #
    #     <- { "event": "SUSPEND_DISK",
    #          "timestamp": { "seconds": 1344456160, "microseconds": 309119 } }
    ##

and in migration.json:

    ##
    # @migrate_cancel:
    #
    # Cancel the current executing migration process.
    #
    # .. note:: This command succeeds even if there is no migration
    #    process running.
    #
    # Since: 0.14
    #
    # .. qmp-example::
    #
    #     -> { "execute": "migrate_cancel" }
    #     <- { "return": {} }
    ##

and in machine.json:

    ##
    # @HV_BALLOON_STATUS_REPORT:
    #
    # Emitted when the hv-balloon driver receives a "STATUS" message from
    # the guest.
    #
    # .. note:: This event is rate-limited.
    #
    # Since: 8.2
    #
    # .. qmp-example::
    #
    #     <- { "event": "HV_BALLOON_STATUS_REPORT",
    #          "data": { "committed": 816640000, "available": 3333054464 },
    #          "timestamp": { "seconds": 1600295492, "microseconds": 661044 } }
    ##

> The check as I wrote it is unintelligent in that it does not bother to
> check if the doc block it is checking is ever one that *could* be inlined;
> i.e. it will complain about being unable to delineate for commands -- even
> though it wouldn't really matter in that case. It's a potential improvement
> to the algorithm to ignore cases where that "ambiguity" is not actually
> important.

The ambiguity affects both doc blocks the inliner inlines from and doc
blocks the inliner inlines into.

When inlining from, the inliner omits "intro", and therefore needs to
know where "intro" ends.

When inlining into, the inliner needs to know where to insert the
inlined material.  When the answer is "right after intro", it needs to
know where "intro" ends.

Getting the former wrong loses information.  Getting the latter wrong
may look funny, which is a lot less serious, but still useful to avoid.
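
To make "where intro ends" concrete for the empty-table case, here is a
rough sketch of how an inliner could split the body.  It assumes the
"Details:" marker from this patch sits on a line of its own; the function
and the exception are hypothetical and not taken from the series.

```python
from typing import Tuple


class AmbiguousIntro(Exception):
    """Raised when the end of "intro" cannot be determined."""


def split_intro(body: str) -> Tuple[str, str]:
    """Return (intro, details) for a body with an empty description table."""
    lines = body.split("\n")
    if "Details:" in lines:
        # Explicit marker: everything above is intro, everything below details.
        i = lines.index("Details:")
        intro = "\n".join(lines[:i]).strip("\n")
        details = "\n".join(lines[i + 1:]).strip("\n")
        return intro, details
    text = body.strip("\n")
    if "\n\n" not in text:
        # Single paragraph: quietly assume it is all intro.
        return text, ""
    raise AmbiguousIntro("multiple paragraphs and no 'Details:' marker")


if __name__ == "__main__":
    print(split_intro("Pause a migration.\n\nDetails:\n\n.. qmp-example::\n"))
```

When inlining from a doc block, the first element is what gets omitted;
when inlining into one, the boundary between the two elements is the
insertion point.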

> But, it's possible to mechanically enforce and nudge documentation writers
> to add the delineation marker where the parser is uncertain.
>
>> [*] Actually, we have means even before this patch, they're just ugly.
>> See the TODO comment added in commit 14b48aaab92 (qapi: convert
>> "Example" sections without titles)
>
>
> That's right. This is merely a formalization of that hack: I add a
> "section" that is intentionally empty and serves only as a marker to the
> parser to begin recording a new section.

Yes.


Let's take a step back again.

Recall the problem's cause is "empty description table".  Can we enforce
non-empty?

Here's the table's syntactic structure:

    member / argument descriptions *
    ( "Features:" line
       feature descriptions ("features") + ) ?
    "Returns" section ?
    "Errors" section ?

This is slightly more strict than what we actually accept now, but
that's detail.

Consider:

    "Members:" / "Arguments:" line
    member / argument descriptions *
    ( "Features:" line
       feature descriptions ("features") + ) ?
    "Returns" section ?
    "Errors" section ?

With this, the table always starts with a "Members" / "Arguments" line,
and thus cannot be empty.

Drawback: we'd have to add this line to every single definition comment.
The main QAPI schema has almost 1000.  Tolerable?

We could require it only when there are no member / argument
descriptions.  55 instances.

We could require it only when there are none, and our "one paragraph"
heuristic for finding the end of "intro" fails.  8 instances.

You might ask what the difference to your "Details:" proposal is.  There
are two.

1. The keyword(s).  Matter of taste, best discussed last.

2. As coded, your patch accepts "Details:" almost[*] anywhere.
   "Members:" / "Arguments:" would be accepted only where member / argument
   descriptions can go, i.e. not after feature descriptions etc.  Consider:

    ##
    # @Enum:
    #
    # @one: The _one_ {and only}, description on the same line
    #
    # Features:
    # @enum-feat: Also _one_ {and only}
    # @enum-member-feat: a member feature
    #
    # Details:
    #
    # @two is undocumented
    ##

   This is accepted, and the "Details:" line gets swallowed.

   I figure tightening the position makes accidents slightly less
   likely.
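
The positional restriction can be illustrated with a toy checker.  This is
only a sketch: the one-letter token encoding is invented for illustration
(M = "Members:" / "Arguments:" line, m = member / argument description,
F = "Features:" line, f = feature description, R = "Returns", E = "Errors"),
and the marker is treated as optional here.

```python
import re

# Table grammar with an optional marker that may only come first:
#   marker? member* ( "Features:" feature+ )? Returns? Errors?
TABLE = re.compile(r"M?m*(Ff+)?R?E?")


def table_ok(tokens: str) -> bool:
    """True if the token string is a well-formed description table."""
    return TABLE.fullmatch(tokens) is not None


if __name__ == "__main__":
    print(table_ok("Mm"))     # marker, then one member description
    print(table_ok("mFffM"))  # marker after feature descriptions
```

With this encoding, the @Enum example above corresponds to a marker after
the feature descriptions, which the checker rejects instead of swallowing.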

Here's another way to force non-empty:

    ( "Members: none" / "Arguments: none" line
    | member / argument descriptions * )
    ( "Features:" line
       feature descriptions ("features") + ) ?
    "Returns" section ?
    "Errors" section ?

This is similar to "require it only when there are no member / argument
descriptions" above, except we also accept it only then.  55 instances.

Syntax ideas better than "Members: none" are welcome.

Thoughts?


[*] Not after untagged sections following tagged ones.

Patch

diff --git a/qapi/machine.json b/qapi/machine.json
index a6b8795b09e..3c1b397f6cc 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1301,6 +1301,8 @@ 
 # Return the amount of initially allocated and present hotpluggable
 # (if enabled) memory in bytes.
 #
+# Details:
+#
 # .. qmp-example::
 #
 #     -> { "execute": "query-memory-size-summary" }
diff --git a/qapi/migration.json b/qapi/migration.json
index 43babd1df41..9070a91e655 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1920,6 +1920,8 @@ 
 #
 # Xen uses this command to notify replication to trigger a checkpoint.
 #
+# Details:
+#
 # .. qmp-example::
 #
 #     -> { "execute": "xen-colo-do-checkpoint" }
@@ -1993,6 +1995,8 @@ 
 #
 # Pause a migration.  Currently it only supports postcopy.
 #
+# Details:
+#
 # .. qmp-example::
 #
 #     -> { "execute": "migrate-pause" }
diff --git a/qapi/qom.json b/qapi/qom.json
index 11277d1f84c..5d285ef9239 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -729,6 +729,8 @@ 
 #
 # Properties for memory-backend-shm objects.
 #
+# Details:
+#
 # This memory backend supports only shared memory, which is the
 # default.
 #
@@ -744,6 +746,8 @@ 
 #
 # Properties for memory-backend-epc objects.
 #
+# Details:
+#
 # The @merge boolean option is false by default with epc
 #
 # The @dump boolean option is false by default with epc
diff --git a/qapi/yank.json b/qapi/yank.json
index 30f46c97c98..4d36d21e76a 100644
--- a/qapi/yank.json
+++ b/qapi/yank.json
@@ -104,6 +104,8 @@ 
 #
 # Returns: list of @YankInstance
 #
+# Details:
+#
 # .. qmp-example::
 #
 #     -> { "execute": "query-yank" }
diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index c5d2b950a82..5890a13b5ba 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -544,6 +544,14 @@  def _tag_check(what: str) -> None:
                         raise QAPIParseError(
                             self, 'feature descriptions expected')
                     have_tagged = True
+                elif line == 'Details:':
+                    _tag_check("Details")
+                    self.accept(False)
+                    line = self.get_doc_line()
+                    while line == '':
+                        self.accept(False)
+                        line = self.get_doc_line()
+                    have_tagged = True
                 elif match := self._match_at_name_colon(line):
                     # description
                     if have_tagged: