diff mbox series

asciidoctor-extensions: provide `<refmiscinfo/>`

Message ID 20190317144747.2418514-1-martin.agren@gmail.com (mailing list archive)
State New, archived
Headers show
Series asciidoctor-extensions: provide `<refmiscinfo/>` | expand

Commit Message

Martin Ågren March 17, 2019, 2:47 p.m. UTC
When we build with AsciiDoc, asciidoc.conf ensures that each xml-file we
generate contains some meta-information which `xmlto` can act on, based
on the following template:

  <refmeta>
  <refentrytitle>{mantitle}</refentrytitle>
  <manvolnum>{manvolnum}</manvolnum>
  <refmiscinfo class="source">Git</refmiscinfo>
  <refmiscinfo class="version">{git_version}</refmiscinfo>
  <refmiscinfo class="manual">Git Manual</refmiscinfo>
  </refmeta>

When we build with Asciidoctor, it does not honor this configuration file
and we end up with only this (for a hypothetical git-foo.xml):

  <refmeta>
  <refentrytitle>git-foo</refentrytitle>
  <manvolnum>1</manvolnum>
  </refmeta>

That is, we miss out on the `<refmiscinfo/>` tags. As a result, the
header of each man page doesn't say "Git Manual", but "git-foo(1)"
instead. Worse, the footers don't give the Git version number and
instead provide the fairly ugly "[FIXME: source]".

That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
implement the `linkgit:` macro in asciidoc.conf *and* in
asciidoctor-extensions.rb. Follow suit and provide these tags in
asciidoctor-extensions.rb, using a "postprocessor" extension.

We may consider a few alternatives:

  * Provide the `mansource` attribute to Asciidoctor. This attribute
    looks promising until one realizes that it can only be given inside
    the source file (the .txt file in our case), *not* on the command
    line using `-a mansource=foobar`. I toyed with the idea of injecting
    this attribute while feeding Asciidoctor the input on stdin, but it
    didn't feel like it was worth the complexity in the Makefile.

  * Similar to that last idea, we could inject these lines into the
    xml-files from the Makefile, e.g., using `sed`. This reduces
    repetition, but seems fairly brittle. It feels wrong to impose
    another step and another risk on the Asciidoc-processing only to
    benefit the Asciidoctor-one.

  * Considering the above abandoned ideas, it seems better to put any
    complexity inside asciidoctor-extensions.rb. It is after all
    supposed to be the "equivalent" of asciidoc.conf. I considered
    providing a "tree processor" extension and use it to set, e.g.,
    `mansource` mentioned above.

Let's instead try to stay as close as possible to what asciidoc.conf
does. We'll make it fairly obvious that we aim to inject the exact same
three lines of `<refmiscinfo/>` that asciidoc.conf provides. The only
somewhat tricky part is that we inject them *post*-processing so we need
to do the variable expansion ourselves.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Cc Todd and Peff who had a brief exchange [1] a while ago. Apparently
 Todd saw this "[FIXME: source]" on Fedora but Peff did not on Debian.
 I'm on Ubuntu 18.04 and used to see this also on 16.04. I'm not sure
 what might make Debian so special here.

 [1] https://public-inbox.org/git/20180627164443.GK20217@zaya.teonanacatl.net/

 I've based this on ma/asciidoctor-fixes, not because it's needed for
 the patch itself, but because it provides a15ef383e7
 ("Documentation/Makefile: add missing dependency on
 asciidoctor-extensions", 2019-02-27), which ensures that the
 documentation will be rebuilt. I'm hoping this was the right call.

 Documentation/asciidoctor-extensions.rb | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Todd Zullinger March 17, 2019, 7:44 p.m. UTC | #1
Hi Martin,

Martin Ågren wrote:
> When we build with AsciiDoc, asciidoc.conf ensures that each xml-file we
> generate contains some meta-information which `xmlto` can act on, based
> on the following template:
> 
>   <refmeta>
>   <refentrytitle>{mantitle}</refentrytitle>
>   <manvolnum>{manvolnum}</manvolnum>
>   <refmiscinfo class="source">Git</refmiscinfo>
>   <refmiscinfo class="version">{git_version}</refmiscinfo>
>   <refmiscinfo class="manual">Git Manual</refmiscinfo>
>   </refmeta>
> 
> When we build with Asciidoctor, it does not honor this configuration file
> and we end up with only this (for a hypothetical git-foo.xml):
> 
>   <refmeta>
>   <refentrytitle>git-foo</refentrytitle>
>   <manvolnum>1</manvolnum>
>   </refmeta>
> 
> That is, we miss out on the `<refmiscinfo/>` tags. As a result, the
> header of each man page doesn't say "Git Manual", but "git-foo(1)"
> instead. Worse, the footers don't give the Git version number and
> instead provide the fairly ugly "[FIXME: source]".
> 
> That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
> implement the `linkgit:` macro in asciidoc.conf *and* in
> asciidoctor-extensions.rb. Follow suit and provide these tags in
> asciidoctor-extensions.rb, using a "postprocessor" extension.

Nice!  I looked at this a long time ago and didn't figure
out how to use a postprocessor extension.  From my notes, I
found discussions about using ruby's tilt for templating and
it all seemed way too convoluted.

Your method looks quite simple and elegant.

> We may consider a few alternatives:
> 
>   * Provide the `mansource` attribute to Asciidoctor. This attribute
>     looks promising until one realizes that it can only be given inside
>     the source file (the .txt file in our case), *not* on the command
>     line using `-a mansource=foobar`. I toyed with the idea of injecting
>     this attribute while feeding Asciidoctor the input on stdin, but it
>     didn't feel like it was worth the complexity in the Makefile.

I played with this direction before.  Using Asciidoctor we
can convert directly from .txt to man without docbook
and xmlto.  That does have some other issues which need to
be worked out though.  Here's what I had as a start:

-- 8< --
Subject: [PATCH] WIP: doc: improve asciidoctor manpage generation

Avoid 'FIXME: Source' by setting mansource.  Skip xmlto step and render
manpages directly with asciidoctor.

TODO:
    - apply to all man pages
    - fix links to html docs, like user-manual.html in git.1 (currently
      it is listed in brackets inline rather than as a footnote)

Reference:
https://lore.kernel.org/lkml/20180424150456.17353-1-tiwai@suse.de/
Signed-off-by: Todd Zullinger <tmz@pobox.com>
---
 Documentation/Makefile                  | 8 ++++++++
 Documentation/asciidoctor-extensions.rb | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index a9697f5146..494f8c9464 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -197,6 +197,7 @@ ASCIIDOC_DOCBOOK = docbook45
 ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
 ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
 ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
+ASCIIDOC_EXTRA += -a mansource="Git $(GIT_VERSION)" -a manmanual="Git Manual"
 DBLATEX_COMMON =
 endif
 
@@ -354,9 +355,16 @@ $(OBSOLETE_HTML): %.html : %.txto asciidoc.conf
 manpage-base-url.xsl: manpage-base-url.xsl.in
 	$(QUIET_GEN)sed "s|@@MAN_BASE_URL@@|$(MAN_BASE_URL)|" $< > $@
 
+ifndef USE_ASCIIDOCTOR
 %.1 %.5 %.7 : %.xml manpage-base-url.xsl $(wildcard manpage*.xsl)
 	$(QUIET_XMLTO)$(RM) $@ && \
 	$(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
+else
+%.1 %.5 %.7 : %.txt
+	$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
+	$(ASCIIDOC_COMMON) -b manpage -d manpage -o $@+ $< && \
+	mv $@+ $@
+endif
 
 %.xml : %.txt asciidoc.conf asciidoctor-extensions.rb
 	$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
diff --git a/Documentation/asciidoctor-extensions.rb b/Documentation/asciidoctor-extensions.rb
index f7a5982f8b..ebb078807a 100644
--- a/Documentation/asciidoctor-extensions.rb
+++ b/Documentation/asciidoctor-extensions.rb
@@ -12,6 +12,8 @@ module Git
         if parent.document.basebackend? 'html'
           prefix = parent.document.attr('git-relative-html-prefix')
           %(<a href="#{prefix}#{target}.html">#{target}(#{attrs[1]})</a>\n)
+        elsif parent.document.basebackend? 'manpage'
+          "#{target}(#{attrs[1]})"
         elsif parent.document.basebackend? 'docbook'
           "<citerefentry>\n" \
             "<refentrytitle>#{target}</refentrytitle>" \
-- 8< --

That was based on ma/asciidoctor-extensions, but it may be
missing other recent improvements you've made to the make
targets.  It's been a month or so since I worked on it.

I munged up doc-diff to set MANDWIDTH=1000 and set one
branch to default to asciidoctor to compare.  (Your other
recent series looks like it'll make doing asciidoc and
asciidoctor comparisons easier.)

There were a number of differences that I didn't work
through though.  Most importantly was the change in the
links noted in the commit message.

Thanks for working on asciidoctor.  I've been trying to poke
it off and on to help ensure the docs can be built if
asciidoc ever gets dropped from Fedora.
Martin Ågren March 17, 2019, 8:03 p.m. UTC | #2
Hi Todd,

On Sun, 17 Mar 2019 at 20:44, Todd Zullinger <tmz@pobox.com> wrote:
>
> Hi Martin,
>
> Martin Ågren wrote:
> > That is, we miss out on the `<refmiscinfo/>` tags. As a result, the
> > header of each man page doesn't say "Git Manual", but "git-foo(1)"
> > instead. Worse, the footers don't give the Git version number and
> > instead provide the fairly ugly "[FIXME: source]".
> >
> > That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
> > implement the `linkgit:` macro in asciidoc.conf *and* in
> > asciidoctor-extensions.rb. Follow suit and provide these tags in
> > asciidoctor-extensions.rb, using a "postprocessor" extension.
>
> > We may consider a few alternatives:
> >
> >   * Provide the `mansource` attribute to Asciidoctor. This attribute
> >     looks promising until one realizes that it can only be given inside
> >     the source file (the .txt file in our case), *not* on the command
> >     line using `-a mansource=foobar`. I toyed with the idea of injecting
> >     this attribute while feeding Asciidoctor the input on stdin, but it
> >     didn't feel like it was worth the complexity in the Makefile.
>
> I played with this direction before.  Using Asciidoctor we
> can convert directly from .txt to man without docbook
> and xmlto.  That does have some other issues which need to
> be worked out though.  Here's what I had as a start:
>
> -- 8< --
> Subject: [PATCH] WIP: doc: improve asciidoctor manpage generation
>
> Avoid 'FIXME: Source' by setting mansource.  Skip xmlto step and render
> manpages directly with asciidoctor.
>
> TODO:
>     - apply to all man pages
>     - fix links to html docs, like user-manual.html in git.1 (currently
>       it is listed in brackets inline rather than as a footnote)
>
> Reference:
> https://lore.kernel.org/lkml/20180424150456.17353-1-tiwai@suse.de/
> Signed-off-by: Todd Zullinger <tmz@pobox.com>
> ---
>  Documentation/Makefile                  | 8 ++++++++
>  Documentation/asciidoctor-extensions.rb | 2 ++
>  2 files changed, 10 insertions(+)
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index a9697f5146..494f8c9464 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -197,6 +197,7 @@ ASCIIDOC_DOCBOOK = docbook45
>  ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
>  ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
>  ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
> +ASCIIDOC_EXTRA += -a mansource="Git $(GIT_VERSION)" -a manmanual="Git Manual"
>  DBLATEX_COMMON =
>  endif
>
> @@ -354,9 +355,16 @@ $(OBSOLETE_HTML): %.html : %.txto asciidoc.conf
>  manpage-base-url.xsl: manpage-base-url.xsl.in
>         $(QUIET_GEN)sed "s|@@MAN_BASE_URL@@|$(MAN_BASE_URL)|" $< > $@
>
> +ifndef USE_ASCIIDOCTOR
>  %.1 %.5 %.7 : %.xml manpage-base-url.xsl $(wildcard manpage*.xsl)
>         $(QUIET_XMLTO)$(RM) $@ && \
>         $(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
> +else
> +%.1 %.5 %.7 : %.txt
> +       $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
> +       $(ASCIIDOC_COMMON) -b manpage -d manpage -o $@+ $< && \
> +       mv $@+ $@
> +endif
>
>  %.xml : %.txt asciidoc.conf asciidoctor-extensions.rb
>         $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
> diff --git a/Documentation/asciidoctor-extensions.rb b/Documentation/asciidoctor-extensions.rb
> index f7a5982f8b..ebb078807a 100644
> --- a/Documentation/asciidoctor-extensions.rb
> +++ b/Documentation/asciidoctor-extensions.rb
> @@ -12,6 +12,8 @@ module Git
>          if parent.document.basebackend? 'html'
>            prefix = parent.document.attr('git-relative-html-prefix')
>            %(<a href="#{prefix}#{target}.html">#{target}(#{attrs[1]})</a>\n)
> +        elsif parent.document.basebackend? 'manpage'
> +          "#{target}(#{attrs[1]})"
>          elsif parent.document.basebackend? 'docbook'
>            "<citerefentry>\n" \
>              "<refentrytitle>#{target}</refentrytitle>" \
> -- 8< --
>
> That was based on ma/asciidoctor-extensions, but it may be
> missing other recent improvements you've made to the make
> targets.  It's been a month or so since I worked on it.
>
> I munged up doc-diff to set MANDWIDTH=1000 and set one
> branch to default to asciidoctor to compare.  (Your other
> recent series looks like it'll make doing asciidoc and
> asciidoctor comparisons easier.)
>
> There were a number of differences that I didn't work
> through though.  Most importantly was the change in the
> links noted in the commit message.

Your approach looks like the correct one long term, at least.

I'll try to play with this to see if I can figure out the differences.
That will have to wait until tomorrow though.

Thanks for sharing your progress.

Martin
Jeff King March 19, 2019, 2:46 a.m. UTC | #3
On Sun, Mar 17, 2019 at 03:47:47PM +0100, Martin Ågren wrote:

>  Cc Todd and Peff who had a brief exchange [1] a while ago. Apparently
>  Todd saw this "[FIXME: source]" on Fedora but Peff did not on Debian.
>  I'm on Ubuntu 18.04 and used to see this also on 16.04. I'm not sure
>  what might make Debian so special here.

I think it was just that my version of asciidoctor had

  https://github.com/asciidoctor/asciidoctor/pull/2636

and Todd's did not. However, mine still does not do the _right_ thing,
because we didn't pass the right attributes in to asciidoctor. It just
didn't print an ugly "FIXME". Looking at the XML, I have:

  <refentrytitle>git-add</refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo class="source">&#160;</refmiscinfo>
  <refmiscinfo class="manual">&#160;</refmiscinfo>
  </refmeta>

So it's just an nbsp instead of the real content, and the "version"
field is missing entirely.

> That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
> implement the `linkgit:` macro in asciidoc.conf *and* in
> asciidoctor-extensions.rb. Follow suit and provide these tags in
> asciidoctor-extensions.rb, using a "postprocessor" extension.

Yeah, that seems sensible overall. Some thoughts on your approach:

>   * Provide the `mansource` attribute to Asciidoctor. This attribute
>     looks promising until one realizes that it can only be given inside
>     the source file (the .txt file in our case), *not* on the command
>     line using `-a mansource=foobar`. I toyed with the idea of injecting
>     this attribute while feeding Asciidoctor the input on stdin, but it
>     didn't feel like it was worth the complexity in the Makefile.

It does seem like "mansource" is the way asciidoctor expects us to do
this. Why doesn't it work from the command line? Is it a bug in
asciidoctor, or is there something more subtle going on?

I think even if it is a bug and gets fixed, though, it still wouldn't
have the version field (though that seems like something we could
contribute to asciidoctor).

>   * Similar to that last idea, we could inject these lines into the
>     xml-files from the Makefile, e.g., using `sed`. This reduces
>     repetition, but seems fairly brittle. It feels wrong to impose
>     another step and another risk on the Asciidoc-processing only to
>     benefit the Asciidoctor-one.

That's more or less what your ruby code is doing on. That said, I'd just
as soon do it in ruby as with a separate sed invocation. At least then
the external build is the same.

>   * Considering the above abandoned ideas, it seems better to put any
>     complexity inside asciidoctor-extensions.rb. It is after all
>     supposed to be the "equivalent" of asciidoc.conf. I considered
>     providing a "tree processor" extension and use it to set, e.g.,
>     `mansource` mentioned above.

This seems like the least bad option, at least for now. Your code does
do a generic regex substitution. The promise of XML is that we're
supposed to be able to do structured, robust transformations of the
document. But my experience has been that the tooling is sufficiently
difficult to work with that you just end up writing a regex.

So I'm curious if you tried to use an actual XML parser (or god forbid,
XSLT) to do the transformation. But if you spent more than 5 minutes on
it and got disgusted, I wouldn't ask you to look deeper than that. :)

I doubt we'd see any other refmeta tags (and any non-tag content would
be quoted).

> Let's instead try to stay as close as possible to what asciidoc.conf
> does. We'll make it fairly obvious that we aim to inject the exact same
> three lines of `<refmiscinfo/>` that asciidoc.conf provides. The only
> somewhat tricky part is that we inject them *post*-processing so we need
> to do the variable expansion ourselves.

One thing that asciidoctor buys us that asciidoc does not is that we
might eventually move to directly generating the manpages, without the
XML / Docbook step in between. And if we do, then all of this XML
hackery is going to have to get replaced with something else. I guess we
can cross that bridge when we come to it.

> diff --git a/Documentation/asciidoctor-extensions.rb b/Documentation/asciidoctor-extensions.rb
> index 0089e0cfb8..059279dee1 100644
> --- a/Documentation/asciidoctor-extensions.rb
> +++ b/Documentation/asciidoctor-extensions.rb
> @@ -20,9 +20,25 @@ module Git
>          end
>        end
>      end
> +
> +    class DocumentPostProcessor < Asciidoctor::Extensions::Postprocessor
> +      def process document, output
> +        if document.basebackend? 'docbook'
> +          git_version = document.attributes['git_version']
> +          replacement = "" \
> +            "<refmiscinfo class=\"source\">Git</refmiscinfo>\n" \
> +            "<refmiscinfo class=\"version\">#{git_version}</refmiscinfo>\n" \
> +            "<refmiscinfo class=\"manual\">Git Manual</refmiscinfo>\n" \
> +            "<\/refmeta>"
> +          output = output.sub(/<\/refmeta>/, replacement)
> +        end
> +        output
> +      end
> +    end

The patch itself looks sane. Would we ever need to XML-quote the
contents of git_version? I guess the asciidoc.conf version doesn't
bother. Technically the user running "make" could put whatever they want
into it, but I think this is a case of "if it hurts, don't do it", and
we can ignore it.

-Peff
Jeff King March 19, 2019, 2:59 a.m. UTC | #4
On Mon, Mar 18, 2019 at 10:46:46PM -0400, Jeff King wrote:

> > Let's instead try to stay as close as possible to what asciidoc.conf
> > does. We'll make it fairly obvious that we aim to inject the exact same
> > three lines of `<refmiscinfo/>` that asciidoc.conf provides. The only
> > somewhat tricky part is that we inject them *post*-processing so we need
> > to do the variable expansion ourselves.
> 
> One thing that asciidoctor buys us that asciidoc does not is that we
> might eventually move to directly generating the manpages, without the
> XML / Docbook step in between. And if we do, then all of this XML
> hackery is going to have to get replaced with something else. I guess we
> can cross that bridge when we come to it.

I see there was some discussion of that in the side-thread, too. Just to
give my two cents: I think that's where we want to go eventually, but I
also think that while we are maintaining a dual asciidoc/asciidoctor
tool chain, it's probably crazy not to go through docbook, just because
it keeps so much of the pipeline the same for both tools.

So in my mind, the endgame is that we eventually drop asciidoc in favor
of asciidoctor. The repo at:

  https://github.com/asciidoc/asciidoc

says:

  NOTE: This implementation is written in Python 2, which EOLs in Jan
  2020. AsciiDoc development is being continued under @asciidoctor.

I'm not sure when is the right time to switch. If we can get the output
at parity, I don't think asciidoctor is too onerous to install (and we
don't have to worry about ancient platforms, as they can use
pre-formatted manpages).

-Peff
Jeff King March 19, 2019, 3:30 a.m. UTC | #5
On Mon, Mar 18, 2019 at 10:46:45PM -0400, Jeff King wrote:

> >   * Provide the `mansource` attribute to Asciidoctor. This attribute
> >     looks promising until one realizes that it can only be given inside
> >     the source file (the .txt file in our case), *not* on the command
> >     line using `-a mansource=foobar`. I toyed with the idea of injecting
> >     this attribute while feeding Asciidoctor the input on stdin, but it
> >     didn't feel like it was worth the complexity in the Makefile.
> 
> It does seem like "mansource" is the way asciidoctor expects us to do
> this. Why doesn't it work from the command line? Is it a bug in
> asciidoctor, or is there something more subtle going on?
> 
> I think even if it is a bug and gets fixed, though, it still wouldn't
> have the version field (though that seems like something we could
> contribute to asciidoctor).

I just tried with asciidoc 2.0.0.rc.2, which came out last week. It does
seem to work from the command line:

  $ make USE_ASCIIDOCTOR=Yes \
         ASCIIDOC_DOCBOOK=docbook5 \
         ASCIIDOC='asciidoctor -amansource=Git -amanmanual="Git Manual"' \
         git-add.xml
  $ sed -n '/refmeta/,/refmeta/p' git-add.xml
  <refmeta>
  <refentrytitle>git-add</refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo class="source">Git</refmiscinfo>
  <refmiscinfo class="manual">Git Manual</refmiscinfo>
  </refmeta>

Still no "manversion" attribute, though.

-Peff
Junio C Hamano March 19, 2019, 3:55 a.m. UTC | #6
Jeff King <peff@peff.net> writes:

> So in my mind, the endgame is that we eventually drop asciidoc in favor
> of asciidoctor. The repo at:
>
>   https://github.com/asciidoc/asciidoc
>
> says:
>
>   NOTE: This implementation is written in Python 2, which EOLs in Jan
>   2020. AsciiDoc development is being continued under @asciidoctor.

;-)

> I'm not sure when is the right time to switch. If we can get the output
> at parity, I don't think asciidoctor is too onerous to install (and we
> don't have to worry about ancient platforms, as they can use
> pre-formatted manpages).

One minor thing that bothers me abit is the continuity of the
pre-formatted pages when I switch to asciidoctor to update them.

I do not mind having to see a huge diff in the "git log -p" output
run in pre-formatted manpages and htmldocs repositories at the
boundary due to e.g. the differences how lines are broken or folded
between the formatters, but by the time we have to transition, the
efforts by you, Martin and friends to allow us compare the formatted
docs would have made the real differences to empty (or at least
negligible).  Knock knock...
Martin Ågren March 19, 2019, 7:02 a.m. UTC | #7
On Sun, 17 Mar 2019 at 20:44, Todd Zullinger <tmz@pobox.com> wrote:
> Martin Ågren wrote:
> >   * Provide the `mansource` attribute to Asciidoctor. This attribute
> >     looks promising until one realizes that it can only be given inside
> >     the source file (the .txt file in our case), *not* on the command
> >     line using `-a mansource=foobar`. I toyed with the idea of injecting
> >     this attribute while feeding Asciidoctor the input on stdin, but it
> >     didn't feel like it was worth the complexity in the Makefile.
>
> I played with this direction before.  Using Asciidoctor we
> can convert directly from .txt to man without docbook
> and xmlto.  That does have some other issues which need to
> be worked out though.  Here's what I had as a start:

> +ASCIIDOC_EXTRA += -a mansource="Git $(GIT_VERSION)" -a manmanual="Git Manual"

So to be honest, I still don't understand how this works, but it does,
great. I really need to improve my documentation-reading skills.

> I munged up doc-diff to set MANDWIDTH=1000 and set one
> branch to default to asciidoctor to compare.  (Your other
> recent series looks like it'll make doing asciidoc and
> asciidoctor comparisons easier.)
>
> There were a number of differences that I didn't work
> through though.  Most importantly was the change in the
> links noted in the commit message.

I had some more time to look at this. Thanks for getting started on this
switch. A few things I noticed:

{litdd} now renders as &#x2d;&#x2d. We should find some other way to
produce '--'. This should then be a simple change, as we're already
providing this attribute inside an `ifdef USE_ASCIIDOCTOR`.

"+" becomes "&#43;". I didn't immediately find where we do that.

Backticks should result in monospace.

`./doc-diff HEAD^ HEAD` shows how several "git-\nfoo" become
"\ngit-foo", i.e., linkgit expansions are now treated as non-breaking.
That's arguably good, but it brings some noise to the diff. Maybe one
should try and see if it's possible to break that to have a nicer
diff, then remove that breakage in a follow-up commit. Or, if it's
possible to make "git-foo" non-breaking before the switch. Hmm, was this
why you increased MANWIDTH?

A double-space between sentences turns into a single space -- at least
in constructions such as "... to foobar. `git-foo` does ...". Not a
problem perhaps, but noise in the diff.

And I'm sure there's more lurking in that huge diff. Whether any of that
is significant or not is another matter. ;-)


Martin
Martin Ågren March 19, 2019, 7:10 a.m. UTC | #8
On Tue, 19 Mar 2019 at 03:46, Jeff King <peff@peff.net> wrote:
>
> On Sun, Mar 17, 2019 at 03:47:47PM +0100, Martin Ågren wrote:
>
> >  Cc Todd and Peff who had a brief exchange [1] a while ago. Apparently
> >  Todd saw this "[FIXME: source]" on Fedora but Peff did not on Debian.
> >  I'm on Ubuntu 18.04 and used to see this also on 16.04. I'm not sure
> >  what might make Debian so special here.
>
> I think it was just that my version of asciidoctor had
>
>   https://github.com/asciidoctor/asciidoctor/pull/2636
>
> and Todd's did not. However, mine still does not do the _right_ thing,
> because we didn't pass the right attributes in to asciidoctor. It just
> didn't print an ugly "FIXME". Looking at the XML, I have:
>
>   <refentrytitle>git-add</refentrytitle>
>   <manvolnum>1</manvolnum>
>   <refmiscinfo class="source">&#160;</refmiscinfo>
>   <refmiscinfo class="manual">&#160;</refmiscinfo>
>   </refmeta>
>
> So it's just an nbsp instead of the real content, and the "version"
> field is missing entirely.

Huh, yeah, that's a big improvement already.

> > That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
> > implement the `linkgit:` macro in asciidoc.conf *and* in
> > asciidoctor-extensions.rb. Follow suit and provide these tags in
> > asciidoctor-extensions.rb, using a "postprocessor" extension.
>
> Yeah, that seems sensible overall. Some thoughts on your approach:
>
> >   * Provide the `mansource` attribute to Asciidoctor. This attribute
> >     looks promising until one realizes that it can only be given inside
> >     the source file (the .txt file in our case), *not* on the command
> >     line using `-a mansource=foobar`. I toyed with the idea of injecting
> >     this attribute while feeding Asciidoctor the input on stdin, but it
> >     didn't feel like it was worth the complexity in the Makefile.
>
> It does seem like "mansource" is the way asciidoctor expects us to do
> this. Why doesn't it work from the command line? Is it a bug in
> asciidoctor, or is there something more subtle going on?
>
> I think even if it is a bug and gets fixed, though, it still wouldn't
> have the version field (though that seems like something we could
> contribute to asciidoctor).

The bug is in my docs-reading, see below.

> >   * Considering the above abandoned ideas, it seems better to put any
> >     complexity inside asciidoctor-extensions.rb. It is after all
> >     supposed to be the "equivalent" of asciidoc.conf. I considered
> >     providing a "tree processor" extension and use it to set, e.g.,
> >     `mansource` mentioned above.
>
> This seems like the least bad option, at least for now. Your code does
> do a generic regex substitution. The promise of XML is that we're
> supposed to be able to do structured, robust transformations of the
> document. But my experience has been that the tooling is sufficiently
> difficult to work with that you just end up writing a regex.
>
> So I'm curious if you tried to use an actual XML parser (or god forbid,
> XSLT) to do the transformation. But if you spent more than 5 minutes on
> it and got disgusted, I wouldn't ask you to look deeper than that. :)

Well, I didn't spend 5 minutes on it, but my experience told me
something like that would happen. ;-)

Now I realize that I'm wrong in my "it doesn't work from the command
line". Somehow, I read the following in the user manual: "Many
attributes can only be defined in the document header (or via the API or
CLI)." And *repeatedly* read is as "(not via the API or CLI)", somehow
always expecting a "not" to follow an "only". Oh well.

But of course, I did try it out before reaching for the docs, like
anyone would. The true reason it doesn't work for me is probably this
header from the listing that contains "mansource": "Manpage attributes
(relevant only when using the manpage doctype and/or converter)".

> I doubt we'd see any other refmeta tags (and any non-tag content would
> be quoted).
>
> > Let's instead try to stay as close as possible to what asciidoc.conf
> > does. We'll make it fairly obvious that we aim to inject the exact same
> > three lines of `<refmiscinfo/>` that asciidoc.conf provides. The only
> > somewhat tricky part is that we inject them *post*-processing so we need
> > to do the variable expansion ourselves.
>
> One thing that asciidoctor buys us that asciidoc does not is that we
> might eventually move to directly generating the manpages, without the
> XML / Docbook step in between. And if we do, then all of this XML
> hackery is going to have to get replaced with something else. I guess we
> can cross that bridge when we come to it.

Todd has made a promising start in another part of this thread. There
seems to be a few wrinkles that need some care, but hopefully nothing
impossible (famous last words).

> The patch itself looks sane. Would we ever need to XML-quote the
> contents of git_version? I guess the asciidoc.conf version doesn't
> bother.

Good point. Hadn't thought of it. You're right that the asciidoc.conf
version has the same problem and a version string like "<>" goes
unescaped into the xml.

> Technically the user running "make" could put whatever they want
> into it, but I think this is a case of "if it hurts, don't do it", and
> we can ignore it.

:-)

Martin
Martin Ågren March 19, 2019, 7:12 a.m. UTC | #9
On Tue, 19 Mar 2019 at 04:30, Jeff King <peff@peff.net> wrote:
>
> On Mon, Mar 18, 2019 at 10:46:45PM -0400, Jeff King wrote:
>
> > It does seem like "mansource" is the way asciidoctor expects us to do
> > this. Why doesn't it work from the command line? Is it a bug in
> > asciidoctor, or is there something more subtle going on?
> >
> > I think even if it is a bug and gets fixed, though, it still wouldn't
> > have the version field (though that seems like something we could
> > contribute to asciidoctor).
>
> I just tried with asciidoc 2.0.0.rc.2, which came out last week. It does
> seem to work from the command line:
>
>   $ make USE_ASCIIDOCTOR=Yes \
>          ASCIIDOC_DOCBOOK=docbook5 \
>          ASCIIDOC='asciidoctor -amansource=Git -amanmanual="Git Manual"' \
>          git-add.xml
>   $ sed -n '/refmeta/,/refmeta/p' git-add.xml
>   <refmeta>
>   <refentrytitle>git-add</refentrytitle>
>   <manvolnum>1</manvolnum>
>   <refmiscinfo class="source">Git</refmiscinfo>
>   <refmiscinfo class="manual">Git Manual</refmiscinfo>
>   </refmeta>

No such luck with asciidoctor 1.5.5. Seems like it really wants
"manpage" before it considers these attributes.

(That's still me holding the tool, so factor that into it.)

Martin
Martin Ågren March 19, 2019, 7:33 a.m. UTC | #10
On Tue, 19 Mar 2019 at 04:55, Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> > So in my mind, the endgame is that we eventually drop asciidoc in favor
> > of asciidoctor. The repo at:
> >
> >   https://github.com/asciidoc/asciidoc
> >
> > says:
> >
> >   NOTE: This implementation is written in Python 2, which EOLs in Jan
> >   2020. AsciiDoc development is being continued under @asciidoctor.
>
> ;-)
>
> > I'm not sure when is the right time to switch. If we can get the output
> > at parity, I don't think asciidoctor is too onerous to install (and we
> > don't have to worry about ancient platforms, as they can use
> > pre-formatted manpages).
>
> One minor thing that bothers me abit is the continuity of the
> pre-formatted pages when I switch to asciidoctor to update them.
>
> I do not mind having to see a huge diff in the "git log -p" output
> run in pre-formatted manpages and htmldocs repositories at the
> boundary due to e.g. the differences how lines are broken or folded
> between the formatters, but by the time we have to transition, the
> efforts by you, Martin and friends to allow us compare the formatted
> docs would have made the real differences to empty (or at least
> negligible).  Knock knock...

This might be a good spot to provide a bit of current status on this.

The patch under discussion here and 185f9a0ea0 ("asciidoctor-extensions:
fix spurious space after linkgit", 2019-02-27), reduce the diff
between asciidoc and asciidoctor considerably. Let's assume for the
moment that these patches or something like them enter master...

There's one larger difference in git-checkout.txt which I'm staying away
from at the moment, since Duy is doing a lot of work there at the time.
(He's not making that asciidoc/tor issue any worse, and he's not
spreading it to say git-switch.txt, so I'd rather just not rush to it.)
Let's assume I get around to this soonish...

Let's also assume that `./doc-diff --cut-header-footer --from-asciidoc
--to-asciidoctor HEAD HEAD` enters master [1]. Then what remains is a
fairly small and understandable diff where most issues are "yeah, I
guess that looks nicer" or even "they're just as fine", rather than "oh
wow, that's ugly". (IMHO.)

Martin

[1] You could already now run it to diff "master master", but you'd
trip on the Makefile thinking there's nothing to do. That'd be fixed by
9a71722b4d ("Doc: auto-detect changed build flags", 2019-03-17).
Martin Ågren March 19, 2019, 7:36 a.m. UTC | #11
Hi Junio

On Tue, 19 Mar 2019 at 08:33, Martin Ågren <martin.agren@gmail.com> wrote:
> between asciidoc and asciidoctor considerably. Let's assume for the
> moment that these patches or something like them enter master...

To be clear. *This* patch has a sufficiently incorrect commit message
that it really needs a makeover. You can expect a v2.

Martin
Jeff King March 19, 2019, 7:43 a.m. UTC | #12
On Tue, Mar 19, 2019 at 08:12:54AM +0100, Martin Ågren wrote:

> > I just tried with asciidoc 2.0.0.rc.2, which came out last week. It does
> > seem to work from the command line:
> >
> >   $ make USE_ASCIIDOCTOR=Yes \
> >          ASCIIDOC_DOCBOOK=docbook5 \
> >          ASCIIDOC='asciidoctor -amansource=Git -amanmanual="Git Manual"' \
> >          git-add.xml
> >   $ sed -n '/refmeta/,/refmeta/p' git-add.xml
> >   <refmeta>
> >   <refentrytitle>git-add</refentrytitle>
> >   <manvolnum>1</manvolnum>
> >   <refmiscinfo class="source">Git</refmiscinfo>
> >   <refmiscinfo class="manual">Git Manual</refmiscinfo>
> >   </refmeta>
> 
> No such luck with asciidoctor 1.5.5. Seems like it really wants
> "manpage" before it considers these attributes.
> 
> (That's still me holding the tool, so factor that into it.)

The refmiscinfo stuff didn't arrive until asciidoctor v1.5.7.

-Peff
Todd Zullinger March 20, 2019, 6:17 p.m. UTC | #13
Hi,

Martin Ågren wrote:
> On Sun, 17 Mar 2019 at 20:44, Todd Zullinger <tmz@pobox.com> wrote:
>> Martin Ågren wrote:
>> +ASCIIDOC_EXTRA += -a mansource="Git $(GIT_VERSION)" -a manmanual="Git Manual"
> 
> So to be honest, I still don't understand how this works, but it does,
> great. I really need to improve my documentation-reading skills.

Let me know if you find any good methods for perfect
retention.  I've re-read enough documentation for a
lifetime. ;)

> I had some more time to look at this. Thanks for getting started on this
> switch. A few things I noticed:
> 
> {litdd} now renders as &#x2d;&#x2d. We should find some other way to
> produce '--'. This should then be a simple change, as we're already
> providing this attribute inside an `ifdef USE_ASCIIDOCTOR`.

I noticed that one and didn't work out a good fix, but it
sounds like you have one in mind.  That's great.

> "+" becomes "&#43;". I didn't immediately find where we do that.

For this one, I was working on replacing "{plus}" with `+`
(along with " " and "-").  That's probably not ideal though.

This is what I had to test:

-- 8< --
Subject: [PATCH] WIP: wrap "[ +-]" in backticks (NOT FOR SUBMISSION)

asciidoctor's manpage output html-encodes "{plus}" which seems like a
bug.  At the least it needs some option I've yet to learn.
---
 Documentation/git-add.txt | 26 +++++++++++++-------------
 Documentation/gitweb.txt  |  6 +++---
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 37bcab94d5..dc1dd3a91b 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -363,20 +363,20 @@ may see in a patch, and which editing operations make sense on them.
 --
 added content::
 
-Added content is represented by lines beginning with "{plus}". You can
+Added content is represented by lines beginning with `"+"`. You can
 prevent staging any addition lines by deleting them.
 
 removed content::
 
-Removed content is represented by lines beginning with "-". You can
-prevent staging their removal by converting the "-" to a " " (space).
+Removed content is represented by lines beginning with `"-"`. You can
+prevent staging their removal by converting the `"-"` to a `" "` (space).
 
 modified content::
 
-Modified content is represented by "-" lines (removing the old content)
-followed by "{plus}" lines (adding the replacement content). You can
-prevent staging the modification by converting "-" lines to " ", and
-removing "{plus}" lines. Beware that modifying only half of the pair is
+Modified content is represented by `"-"` lines (removing the old content)
+followed by `"+"` lines (adding the replacement content). You can
+prevent staging the modification by converting `"-"` lines to `" "`, and
+removing `"+"` lines. Beware that modifying only half of the pair is
 likely to introduce confusing changes to the index.
 --
 
@@ -393,29 +393,29 @@ Avoid using these constructs, or do so with extreme caution.
 removing untouched content::
 
 Content which does not differ between the index and working tree may be
-shown on context lines, beginning with a " " (space).  You can stage
-context lines for removal by converting the space to a "-". The
+shown on context lines, beginning with a `" "` (space).  You can stage
+context lines for removal by converting the space to a `"-"`. The
 resulting working tree file will appear to re-add the content.
 
 modifying existing content::
 
 One can also modify context lines by staging them for removal (by
-converting " " to "-") and adding a "{plus}" line with the new content.
-Similarly, one can modify "{plus}" lines for existing additions or
+converting `" "` to `"-"`) and adding a `"+"` line with the new content.
+Similarly, one can modify `"+"` lines for existing additions or
 modifications. In all cases, the new modification will appear reverted
 in the working tree.
 
 new content::
 
 You may also add new content that does not exist in the patch; simply
-add new lines, each starting with "{plus}". The addition will appear
+add new lines, each starting with `"+"`. The addition will appear
 reverted in the working tree.
 --
 
 There are also several operations which should be avoided entirely, as
 they will make the patch impossible to apply:
 
-* adding context (" ") or removal ("-") lines
+* adding context (`" "`) or removal (`"-"`) lines
 * deleting context or removal lines
 * modifying the contents of context or removal lines
 
diff --git a/Documentation/gitweb.txt b/Documentation/gitweb.txt
index 88450589af..d27f239242 100644
--- a/Documentation/gitweb.txt
+++ b/Documentation/gitweb.txt
@@ -80,15 +80,15 @@ continuation (newline escaping).
 * Leading and trailing whitespace are ignored.
 
 * Whitespace separated fields; any run of whitespace can be used as field
-separator (rules for Perl's "`split(" ", $line)`").
+separator (rules for Perl's "`split(`" "`, $line)`").
 
 * Fields use modified URI encoding, defined in RFC 3986, section 2.1
 (Percent-Encoding), or rather "Query string encoding" (see
 https://en.wikipedia.org/wiki/Query_string#URL_encoding[]), the difference
-being that SP (" ") can be encoded as "{plus}" (and therefore "{plus}" has to be
+being that SP (`" "`) can be encoded as `"+"` (and therefore `"+"` has to be
 also percent-encoded).
 +
-Reserved characters are: "%" (used for encoding), "{plus}" (can be used to
+Reserved characters are: "%" (used for encoding), `"+"` (can be used to
 encode SPACE), all whitespace characters as defined in Perl, including SP,
 TAB and LF, (used to separate fields in a record).

-- 8< --

> `./doc-diff HEAD^ HEAD` shows how several "git-\nfoo" become
> "\ngit-foo", i.e., linkgit expansions are now treated as non-breaking.
> That's arguably good, but it brings some noise to the diff. Maybe one
> should try and see if it's possible to break that to have a nicer
> diff, then remove that breakage in a follow-up commit. Or, if it's
> possible to make "git-foo" non-breaking before the switch. Hmm, was this
> why you increased MANWIDTH?

Yeah, I noticed a number of places where asciidoc and
asciidoctor wrapped lines at slightly different places.  I
didn't see if they were all due to wrapping at a dashed git
command, but that could certainly have been the main cause.

I set the large MANWIDTH and then I think I added
--color-words to the diff call in doc-diff.

> A double-space between sentences turns into a single space -- at least
> in constructions such as "... to foobar. `git-foo` does ...". Not a
> problem perhaps, but noise in the diff.
> 
> And I'm sure there's more lurking in that huge diff. Whether any of that
> is significant or not is another matter. ;-)

Oh my yes, I'm sure. :)
Todd Zullinger March 20, 2019, 6:32 p.m. UTC | #14
Jeff King wrote:
> On Tue, Mar 19, 2019 at 08:12:54AM +0100, Martin Ågren wrote:
> 
>>> I just tried with asciidoc 2.0.0.rc.2, which came out last week. It does
>>> seem to work from the command line:
>>>
>>>   $ make USE_ASCIIDOCTOR=Yes \
>>>          ASCIIDOC_DOCBOOK=docbook5 \
>>>          ASCIIDOC='asciidoctor -amansource=Git -amanmanual="Git Manual"' \
>>>          git-add.xml
>>>   $ sed -n '/refmeta/,/refmeta/p' git-add.xml
>>>   <refmeta>
>>>   <refentrytitle>git-add</refentrytitle>
>>>   <manvolnum>1</manvolnum>
>>>   <refmiscinfo class="source">Git</refmiscinfo>
>>>   <refmiscinfo class="manual">Git Manual</refmiscinfo>
>>>   </refmeta>
>> 
>> No such luck with asciidoctor 1.5.5. Seems like it really wants
>> "manpage" before it considers these attributes.
>> 
>> (That's still me holding the tool, so factor that into it.)
> 
> The refmiscinfo stuff didn't arrive until asciidoctor v1.5.7.

Now I don't feel as bad that I didn't find any good way to
handle refmiscinfo when I first tried to use asciidoctor in
November 2017 at least.

This made me look closer at the fedora asciidoctor packages.
They have been stuck on 1.5.6.1.  So all my recent testing
has been with 1.5.6.1.

I spent some time yesterday working toward getting the
fedora packages updated to 1.5.8.  With luck that will reach
the stable updates channels before too long.

(I'm guessing the upcoming 2.0 release won't be suitable as
an update for released fedora versions, being a major
release with potentially backwards-incompatible changes.)

There are differences in the output from 1.5.6.1 and 1.5.8,
but I haven't looked closely yet.
Martin Ågren March 22, 2019, 9:01 p.m. UTC | #15
On Wed, 20 Mar 2019 at 19:17, Todd Zullinger <tmz@pobox.com> wrote:
> Martin Ågren wrote:
> > {litdd} now renders as &#x2d;&#x2d. We should find some other way to
> > produce '--'. This should then be a simple change, as we're already
> > providing this attribute inside an `ifdef USE_ASCIIDOCTOR`.
>
> I noticed that one and didn't work out a good fix, but it
> sounds like you have one in mind.  That's great.
>
> > "+" becomes "&#43;". I didn't immediately find where we do that.
>
> For this one, I was working on replacing "{plus}" with `+`
> (along with " " and "-").  That's probably not ideal though.

The "plus" and "litdd" issues seem like they can be solved by doing:

  ASCIIDOC_EXTRA += -aplus='+'
  ASCIIDOC_EXTRA += -alitdd='\--'

> > `./doc-diff HEAD^ HEAD` shows how several "git-\nfoo" become
> > "\ngit-foo", i.e., linkgit expansions are now treated as non-breaking.
> > [...] Hmm, was this
> > why you increased MANWIDTH?
>
> Yeah, I noticed a number of places where asciidoc and
> asciidoctor wrapped lines at slightly different places.  I
> didn't see if they were all due to wrapping at a dashed git
> command, but that could certainly have been the main cause.
>
> I set the large MANWIDTH and then I think I added
> --color-words to the diff call in doc-diff.
>
> > A double-space between sentences turns into a single space -- at least
> > in constructions such as "... to foobar. `git-foo` does ...". Not a
> > problem perhaps, but noise in the diff.
> >
> > And I'm sure there's more lurking in that huge diff. Whether any of that
> > is significant or not is another matter. ;-)
>
> Oh my yes, I'm sure. :)

:)

Using origin/ma/doc-diff-doc-vs-doctor-comparison, you can do something
like

  ./doc-diff --asciidoctor --cut-header-footer foo bar --
--ignore-space-change --word-diff

to reduce the amount of noise in the diff by a great deal. Going through
the result, I noted two issues which should be fixed by Asciidoctor
1.5.7, according to some initial searching (but I haven't tested):

"--" renders as an em dash followed by a ";". Looks like it's
https://github.com/asciidoctor/asciidoctor/issues/2604

"&" turns into "&amp;". Seems to be known and fixed:
https://github.com/asciidoctor/asciidoctor/issues/2525

Plus a few other problems. There's the odd problem here and there, but
most of the diff seems to be a large number of occurences of just a few
issues.

It would probably be worthwhile to try 1.5.7+ to see how much that
improves things. Seems like you're already underway there.

Martin
Todd Zullinger March 23, 2019, 7:27 p.m. UTC | #16
Hi,

Martin Ågren wrote:
> On Wed, 20 Mar 2019 at 19:17, Todd Zullinger <tmz@pobox.com> wrote:
>> Martin Ågren wrote:
>>> {litdd} now renders as &#x2d;&#x2d. We should find some other way to
>>> produce '--'. This should then be a simple change, as we're already
>>> providing this attribute inside an `ifdef USE_ASCIIDOCTOR`.
>>
>> I noticed that one and didn't work out a good fix, but it
>> sounds like you have one in mind.  That's great.
>>
>>> "+" becomes "&#43;". I didn't immediately find where we do that.
>>
>> For this one, I was working on replacing "{plus}" with `+`
>> (along with " " and "-").  That's probably not ideal though.
> 
> The "plus" and "litdd" issues seem like they can be solved by doing:
> 
>   ASCIIDOC_EXTRA += -aplus='+'
>   ASCIIDOC_EXTRA += -alitdd='\--'

Hmm, setting litdd makes the html generate an em dash
followed by a zero width space (in 1.5.8 and 2.0.0)

I updated my system to asciidoctor-2.0.0 last night and now
I can't even generate the man pages properly, because the
docbook45 converter was removed.  I'll have to see if I
missed some other required update. :/

> It would probably be worthwhile to try 1.5.7+ to see how much that
> improves things. Seems like you're already underway there.

Yeah, before I knew how soon 2.0.0 was coming, I updated to
1.5.8 and built the various Fedora packages which require it
to see how they handled it.  Almost all of the changes were
fixes to bugs in previous versions.  And the one which was
not is likely to be fixed in 2.0.0 according to asciidoctor
maintainer Dan Allen.

Have you looked at diffing the html output as well?  It
seems like we'll need to check it as well to be sure any
fixes to the manpage output don't have a negative impact on
the html output, and vice versa.

I used 'links -dump' output for comparison of the various
Fedora packages which currently build with asciidoctor.
It's not perfect.  It could miss visual changes which might
be important when viewed in a graphical browser.  But it was
better than trying to diff the html files directly. :)

We probably also want to check the validity of links within
the docs, as one of the changes in 1.5.8 caused breakage of
cross interdocument references.  (This is the change I noted
above which should be fixed in 2.0.0.)

It'll be quite a while until most systems with asciidoctor
1.5.x are gone.  I doubt that upgrading to even 1.5.8 is
suitable for the currently released Fedora versions due to
incompatible changes.  I am going to try and get 2.0.0 into
Fedora 30, but the deadline for changes has just passed, so
I may not be able to do so.  If not, it'll be 6-8 months
until a released version of Fedora has an asciidoctor newer
than 1.5.6.1.

All that is just to say that even if newer asciidoctor fixes
many of the issues we've seen it will still be a long time
before we can reasonable default to asciidoctor or drop
asciidoc support.

For what it's worth, the Fedora asciidoc packages moved to
python3 using https://github.com/asciidoc/asciidoc-py3.
That's not very active, but it should at least keep asciidoc
alive beyond the approaching python2 EOL.
Jeff King March 24, 2019, 12:16 p.m. UTC | #17
On Sat, Mar 23, 2019 at 03:27:56PM -0400, Todd Zullinger wrote:

> I updated my system to asciidoctor-2.0.0 last night and now
> I can't even generate the man pages properly, because the
> docbook45 converter was removed.  I'll have to see if I
> missed some other required update. :/

I ran into that, too. Using ASCIIDOC_DOCBOOK=docbook5 worked. The output
looked OK to me, but I only eyeballed it briefly. It's possible there
are subtle problems.

-Peff
Todd Zullinger March 24, 2019, 4:21 p.m. UTC | #18
Hi,

Jeff King wrote:
> On Sat, Mar 23, 2019 at 03:27:56PM -0400, Todd Zullinger wrote:
> 
>> I updated my system to asciidoctor-2.0.0 last night and now
>> I can't even generate the man pages properly, because the
>> docbook45 converter was removed.  I'll have to see if I
>> missed some other required update. :/
> 
> I ran into that, too. Using ASCIIDOC_DOCBOOK=docbook5 worked. The output
> looked OK to me, but I only eyeballed it briefly. It's possible there
> are subtle problems.

Interesting.  I did that last night as well, but it only led
to more errors.  I'm curious why manpage builds work for you
and not for me.

On my fedora 29 test system, ASCIIDOC_DOCBOOK=docbook5 leads
to a validation failure from xmlto, since docbook5 doesn't
use a DTD¹.  I added XMLTO_EXTRA = --skip-validation to the
USE_ASCIIDOCTOR block, which can build many of the man
pages, but fails when it gets to git-blame due to use of
literal < > characters in the xml:

    git-blame.xml:423: parser error : StartTag: invalid element name
    <literallayout class="monospaced"><40-byte hex sha1> <sourceline> <resultline> <
                                       ^

While this may be a real issue in the formatting of
git-blame.txt which we should fix, I'm suspicious of that
due to the number of other files similarly affected:

    git-blame.txt
    git-diff-files.txt
    git-diff-index.txt
    git-diff-tree.txt
    git-diff.txt
    git-format-patch.txt
    git-http-fetch.txt
    git-log.txt
    git-rebase.txt
    git-rev-list.txt
    git-show.txt
    git-svn.txt

¹ Here's the failure I get without --skip-validation:
[tmz@f29 Documentation]$ make USE_ASCIIDOCTOR=1 git.1
    SUBDIR ../
make[1]: 'GIT-VERSION-FILE' is up to date.
    ASCIIDOC git.xml
    XMLTO git.1
xmlto: /home/tmz/src/git/Documentation/git.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
validity error : no DTD found!
Document /home/tmz/src/git/Documentation/git.xml does not validate
make: *** [Makefile:359: git.1] Error 13

Thanks,
Jeff King March 25, 2019, 3:06 p.m. UTC | #19
On Sun, Mar 24, 2019 at 12:21:31PM -0400, Todd Zullinger wrote:

> > I ran into that, too. Using ASCIIDOC_DOCBOOK=docbook5 worked. The output
> > looked OK to me, but I only eyeballed it briefly. It's possible there
> > are subtle problems.
> 
> Interesting.  I did that last night as well, but it only led
> to more errors.  I'm curious why manpage builds work for you
> and not for me.

I think it's because I'm an idiot. I must have only been using 2.0.0
when I was looking at the XML output manually (for the refmiscinfo
lines), and never actually rendered it to roff. I get the same problem
when I try a full build.

> On my fedora 29 test system, ASCIIDOC_DOCBOOK=docbook5 leads
> to a validation failure from xmlto, since docbook5 doesn't
> use a DTD¹.  I added XMLTO_EXTRA = --skip-validation to the
> USE_ASCIIDOCTOR block, which can build many of the man
> pages, but fails when it gets to git-blame due to use of
> literal < > characters in the xml:
> 
>     git-blame.xml:423: parser error : StartTag: invalid element name
>     <literallayout class="monospaced"><40-byte hex sha1> <sourceline> <resultline> <
>                                        ^

That seems like a bug in asciidoctor, which ought to be quoting the "<".
We certainly can't quote it ourselves (we don't even know that our
backend output is going to a format that needs angle brackets quoted).

-Peff
Todd Zullinger March 25, 2019, 7 p.m. UTC | #20
Hi,

Jeff King wrote:
> On Sun, Mar 24, 2019 at 12:21:31PM -0400, Todd Zullinger wrote:
>> I'm curious why manpage builds work for you and not for me.
> 
> I think it's because I'm an idiot. I must have only been using 2.0.0
> when I was looking at the XML output manually (for the refmiscinfo
> lines), and never actually rendered it to roff. I get the same problem
> when I try a full build.

Ahh.  I was hoping you'd tell me that I was a fool. :)

>> On my fedora 29 test system, ASCIIDOC_DOCBOOK=docbook5 leads
>> to a validation failure from xmlto, since docbook5 doesn't
>> use a DTD¹.  I added XMLTO_EXTRA = --skip-validation to the
>> USE_ASCIIDOCTOR block, which can build many of the man
>> pages, but fails when it gets to git-blame due to use of
>> literal < > characters in the xml:
>> 
>>     git-blame.xml:423: parser error : StartTag: invalid element name
>>     <literallayout class="monospaced"><40-byte hex sha1> <sourceline> <resultline> <
>>                                        ^
> 
> That seems like a bug in asciidoctor, which ought to be quoting the "<".
> We certainly can't quote it ourselves (we don't even know that our
> backend output is going to a format that needs angle brackets quoted).

Yep, it seems so.  I filed this upstream:

https://github.com/asciidoctor/asciidoctor/issues/3205

I updated to asciidoctor-2.0.1 this morning to test, in case
it was one of the issues fixed since the 2.0.0 release.
Alas, we're the first to hit it and report it.
Todd Zullinger March 27, 2019, 1:06 a.m. UTC | #21
Hi,

I wrote:
> Jeff King wrote:
>> That seems like a bug in asciidoctor, which ought to be quoting the "<".
>> We certainly can't quote it ourselves (we don't even know that our
>> backend output is going to a format that needs angle brackets quoted).
> 
> Yep, it seems so.  I filed this upstream:
> 
> https://github.com/asciidoctor/asciidoctor/issues/3205
> 
> I updated to asciidoctor-2.0.1 this morning to test, in case
> it was one of the issues fixed since the 2.0.0 release.
> Alas, we're the first to hit it and report it.

Dan Allen fixed this upstream and released 2.0.2 today.
It's very good to know that asciidoctor upstream is
incredibly responsive.  If anyone runs into Dan at a
conference, please buy him a beer. ;)

There's still the matter of 2.0 dropping docbook45.  I'll
try to get around to testing 1.5.x releases with docbook5 to
see if they work reasonably well.  If not, we can add a
version check and set ASCIIDOC_DOCBOOK appropriately.

One other issue that crops up with docbook5 is that the
xmlto toolchain now outputs:

    Note: namesp. cut : stripped namespace before processing

as documented at

    https://docbook.org/docs/howto/howto.html#dbxsl

It's mostly an annoyance which we either want to strip out,
or pass an alternate docbook5 xsl, I think.  But I'm not
very familiar with the guts of the xml/docbook toolchains.
SZEDER Gábor March 27, 2019, 10:05 a.m. UTC | #22
On Tue, Mar 26, 2019 at 09:06:03PM -0400, Todd Zullinger wrote:
> Dan Allen fixed this upstream and released 2.0.2 today.
> It's very good to know that asciidoctor upstream is
> incredibly responsive.  If anyone runs into Dan at a
> conference, please buy him a beer. ;)

I noticed the "Release beer" lines in the Asciidoctor relnotes... :)

> One other issue that crops up with docbook5 is that the
> xmlto toolchain now outputs:
> 
>     Note: namesp. cut : stripped namespace before processing
> 
> as documented at
> 
>     https://docbook.org/docs/howto/howto.html#dbxsl
> 
> It's mostly an annoyance which we either want to strip out,
> or pass an alternate docbook5 xsl, I think.  But I'm not
> very familiar with the guts of the xml/docbook toolchains.

In our documentation CI build jobs we check that nothing is written to
Asciidoc/Asciidoctor's standard error [1].  These "Note:" lines go to
stderr, and will trigger that check and cause the build to fail.  So
wither we should find a way to silence these notes, or filter them out
in the CI builds.


[1] 505ad91304 (travis-ci: check AsciiDoc/AsciiDoctor stderr output,
    2017-04-26), though it never actually worked as intended, but that
    is about to get fixed:

    https://public-inbox.org/git/20190324215534.9495-7-szeder.dev@gmail.com/
brian m. carlson March 28, 2019, 12:06 a.m. UTC | #23
On Tue, Mar 26, 2019 at 09:06:03PM -0400, Todd Zullinger wrote:
> There's still the matter of 2.0 dropping docbook45.  I'll
> try to get around to testing 1.5.x releases with docbook5 to
> see if they work reasonably well.  If not, we can add a
> version check and set ASCIIDOC_DOCBOOK appropriately.
> 
> One other issue that crops up with docbook5 is that the
> xmlto toolchain now outputs:
> 
>     Note: namesp. cut : stripped namespace before processing
> 
> as documented at
> 
>     https://docbook.org/docs/howto/howto.html#dbxsl
> 
> It's mostly an annoyance which we either want to strip out,
> or pass an alternate docbook5 xsl, I think.  But I'm not
> very familiar with the guts of the xml/docbook toolchains.

I'm quite familiar with this area, because I used DocBook and its
stylesheets for my college papers. There are two sets of stylesheets,
the namespaced ones (for DocBook 5) and the non-namespaced ones (for
DocBook 4). They can be distinguished because the URLs (and typically
the paths) use "xsl" (for non-namespaced) or "xsl-ns" (for namespaced).

xmlto by default uses the older ones, and assuming you have a reasonably
capable XSLT processor, either set of stylesheets can be used, since
they simply transform the document into the right type (although with a
warning, as you've noticed).

Unfortunately, xmlto is hard-coded to use the non-namespaced stylesheets
with no option to specify which to use, and it doesn't appear to be
actively developed. There's an option to specify the stylesheet (-x),
but it can't take a URL, since xmlto "helpfully" fully qualifies the
path. That means we'd need to know the location on disk of those
stylesheets in order to use them, since we can't rely on catalog
resolution.
Jeff King March 28, 2019, 2:54 a.m. UTC | #24
On Tue, Mar 26, 2019 at 09:06:03PM -0400, Todd Zullinger wrote:

> > I updated to asciidoctor-2.0.1 this morning to test, in case
> > it was one of the issues fixed since the 2.0.0 release.
> > Alas, we're the first to hit it and report it.
> 
> Dan Allen fixed this upstream and released 2.0.2 today.
> It's very good to know that asciidoctor upstream is
> incredibly responsive.  If anyone runs into Dan at a
> conference, please buy him a beer. ;)

Cool. I've interacted a few times with the asciidoctor project due to
our use on git-scm.com, and have always found them pleasant and
responsive.

> There's still the matter of 2.0 dropping docbook45.  I'll
> try to get around to testing 1.5.x releases with docbook5 to
> see if they work reasonably well.  If not, we can add a
> version check and set ASCIIDOC_DOCBOOK appropriately.

With the hacky patch below, I was able to doc-diff the v1.5.8 versus
v2.0.2 output. It looks like there are quite a few cosmetic whitespace
changes. Some probably intentional (extra blank between terms and
paragraphs in a definition list) and some probably not (extra whitespace
before start of content in a bulleted list).

diff --git a/Documentation/doc-diff b/Documentation/doc-diff
index 3355be4798..4c2dccf516 100755
--- a/Documentation/doc-diff
+++ b/Documentation/doc-diff
@@ -17,9 +17,11 @@ f			force rebuild; do not rely on cached results
 c,clean			cleanup temporary working files
 from-asciidoc		use asciidoc with the 'from'-commit
 from-asciidoctor	use asciidoctor with the 'from'-commit
+from-asciidoctor2	use asciidoctor2 with the 'from'-commit
 asciidoc		use asciidoc with both commits
 to-asciidoc		use asciidoc with the 'to'-commit
 to-asciidoctor		use asciidoctor with the 'to'-commit
+to-asciidoctor2		use asciidoctor with the 'to'-commit
 asciidoctor		use asciidoctor with both commits
 cut-header-footer	cut away header and footer
 "
@@ -55,6 +57,10 @@ do
 	--asciidoc)
 		from_program=-asciidoc
 		to_program=-asciidoc ;;
+	--from-asciidoctor2)
+		from_program=-asciidoctor2 ;;
+	--to-asciidoctor2)
+		to_program=-asciidoctor2 ;;
 	--cut-header-footer)
 		cut_header_footer=-cut-header-footer ;;
 	--)
@@ -112,6 +118,13 @@ construct_makemanflags () {
 	elif test "$1" = "-asciidoctor"
 	then
 		echo USE_ASCIIDOCTOR=YesPlease
+	elif test "$1" = "-asciidoctor2"
+	then
+		echo USE_ASCIIDOCTOR=YesPlease
+		echo ASCIIDOC_DOCBOOK=docbook5
+		echo XMLTO_EXTRA=--skip-validation
+		# not really portable :)
+		echo ASCIIDOCTOR=/tmp/run-asciidoctor-2
 	fi
 }
 
@@ -182,6 +195,6 @@ render_tree () {
 	fi
 }
 
-render_tree $from_oid $from_dir $from_makemanflags &&
-render_tree $to_oid $to_dir $to_makemanflags &&
+render_tree $from_oid $from_dir "$from_makemanflags" &&
+render_tree $to_oid $to_dir "$to_makemanflags" &&
 git -C $tmp/rendered diff --no-index "$@" $from_dir $to_dir
Jeff King March 28, 2019, 3:33 a.m. UTC | #25
On Wed, Mar 27, 2019 at 10:54:34PM -0400, Jeff King wrote:

> @@ -112,6 +118,13 @@ construct_makemanflags () {
>  	elif test "$1" = "-asciidoctor"
>  	then
>  		echo USE_ASCIIDOCTOR=YesPlease
> +	elif test "$1" = "-asciidoctor2"
> +	then
> +		echo USE_ASCIIDOCTOR=YesPlease
> +		echo ASCIIDOC_DOCBOOK=docbook5
> +		echo XMLTO_EXTRA=--skip-validation
> +		# not really portable :)
> +		echo ASCIIDOCTOR=/tmp/run-asciidoctor-2
>  	fi

Oops, this last line should be ASCIIDOC=..., of course. Which meant that
I was actually just testing the docbook5 output from v1.5.8. The results
with v2.0.2 seem similar, though.

-Peff
Todd Zullinger March 30, 2019, 6 p.m. UTC | #26
Hi,

brian m. carlson wrote:
> On Tue, Mar 26, 2019 at 09:06:03PM -0400, Todd Zullinger wrote:
>> There's still the matter of 2.0 dropping docbook45.  I'll
>> try to get around to testing 1.5.x releases with docbook5 to
>> see if they work reasonably well.  If not, we can add a
>> version check and set ASCIIDOC_DOCBOOK appropriately.
>> 
>> One other issue that crops up with docbook5 is that the
>> xmlto toolchain now outputs:
>> 
>>     Note: namesp. cut : stripped namespace before processing
>> 
>> as documented at
>> 
>>     https://docbook.org/docs/howto/howto.html#dbxsl
>> 
>> It's mostly an annoyance which we either want to strip out,
>> or pass an alternate docbook5 xsl, I think.  But I'm not
>> very familiar with the guts of the xml/docbook toolchains.
> 
> I'm quite familiar with this area, because I used DocBook and its
> stylesheets for my college papers. There are two sets of stylesheets,
> the namespaced ones (for DocBook 5) and the non-namespaced ones (for
> DocBook 4). They can be distinguished because the URLs (and typically
> the paths) use "xsl" (for non-namespaced) or "xsl-ns" (for namespaced).
> 
> xmlto by default uses the older ones, and assuming you have a reasonably
> capable XSLT processor, either set of stylesheets can be used, since
> they simply transform the document into the right type (although with a
> warning, as you've noticed).
> 
> Unfortunately, xmlto is hard-coded to use the non-namespaced stylesheets
> with no option to specify which to use, and it doesn't appear to be
> actively developed. There's an option to specify the stylesheet (-x),
> but it can't take a URL, since xmlto "helpfully" fully qualifies the
> path. That means we'd need to know the location on disk of those
> stylesheets in order to use them, since we can't rely on catalog
> resolution.

Thanks for the very useful docbook5/xmlto details!

The hard-coded use of the non-namespaced stylesheets in
xmlto is unfortunate.  I hadn't gotten far enough along to
run into that limitation of xmlto, so many thanks for saving
me from finding it myself.  I'm sure it would have taken me
far longer.

If it turns out that docbook5 causes us more pain than it's
worth, I suppose we could choose to use the builtin manpage
backend when using asciidoctor >= 2.

Or we could see about adding a docbook45 converter, which
upstream noted as a possibility in the ticket¹ which dropped
docbook45 by default.

It'll be some time before we can reasonably expect to only
support asciidoctor-2.x.  We'll have to see what method
involves the least ugly hacks to support asciidoc and the
various 1.5.x and 2.x versions of asciidoctor.

¹ https://github.com/asciidoctor/asciidoctor/issues/3005
brian m. carlson March 30, 2019, 9:04 p.m. UTC | #27
On Sat, Mar 30, 2019 at 02:00:14PM -0400, Todd Zullinger wrote:
> Thanks for the very useful docbook5/xmlto details!
> 
> The hard-coded use of the non-namespaced stylesheets in
> xmlto is unfortunate.  I hadn't gotten far enough along to
> run into that limitation of xmlto, so many thanks for saving
> me from finding it myself.  I'm sure it would have taken me
> far longer.
> 
> If it turns out that docbook5 causes us more pain than it's
> worth, I suppose we could choose to use the builtin manpage
> backend when using asciidoctor >= 2.

I suspect this will be the easiest way forward.  If we produce
semantically equivalent manpages, then this is also a lot nicer for
people who would prefer not to have a full XML toolchain installed.

> Or we could see about adding a docbook45 converter, which
> upstream noted as a possibility in the ticket¹ which dropped
> docbook45 by default.

Another possibility, depending on how responsive the xmlto upstream is,
is to add some sort of DocBook 5 support to it. This shouldn't be
terribly difficult, although we'd have to disable validation. DocBook 5
uses RELAX NG, and libxml2/xmllint's support for this has some bugs,
such that it will mark some valid documents using complex interleaves as
invalid. Still, if this is the direction we want to go, I have no
problem adding this support.

Since we'd have a quite new Asciidoctor and a new xmlto, distros should
have both around the same time.
Todd Zullinger April 5, 2019, 2:17 a.m. UTC | #28
brian m. carlson wrote:
> On Sat, Mar 30, 2019 at 02:00:14PM -0400, Todd Zullinger wrote:
>> Thanks for the very useful docbook5/xmlto details!
>> 
>> The hard-coded use of the non-namespaced stylesheets in
>> xmlto is unfortunate.  I hadn't gotten far enough along to
>> run into that limitation of xmlto, so many thanks for saving
>> me from finding it myself.  I'm sure it would have taken me
>> far longer.
>> 
>> If it turns out that docbook5 causes us more pain than it's
>> worth, I suppose we could choose to use the builtin manpage
>> backend when using asciidoctor >= 2.
> 
> I suspect this will be the easiest way forward.  If we produce
> semantically equivalent manpages, then this is also a lot nicer for
> people who would prefer not to have a full XML toolchain installed.

It's a good end goal.  There are a number of differences in
the output when using the manpage backend directly verus
docbook and then xmlto.  The way links to external html
documents are presented is the main one which comes to mind.
Maybe we can adjust that via asciidoctor-extensions.rb or
something, I don't know.

Elsewhere in this thread, Jeff made the very valid point
that we're probably wise to keep using the docbook/xmlto
chain as long as we're supporting both asciidoc and
asciidoctor.  Unless it turns out that it's more work to
coax asciidoctor (and the various 1.5 and 2.0 releases in
common use) to work with that same docbook/xmlto chain than
it is to do it directly, that is.

>> Or we could see about adding a docbook45 converter, which
>> upstream noted as a possibility in the ticket¹ which dropped
>> docbook45 by default.
> 
> Another possibility, depending on how responsive the xmlto upstream is,
> is to add some sort of DocBook 5 support to it. This shouldn't be
> terribly difficult, although we'd have to disable validation. DocBook 5
> uses RELAX NG, and libxml2/xmllint's support for this has some bugs,
> such that it will mark some valid documents using complex interleaves as
> invalid. Still, if this is the direction we want to go, I have no
> problem adding this support.
> 
> Since we'd have a quite new Asciidoctor and a new xmlto, distros should
> have both around the same time.

Good point.  It can't hurt to push for improvements across
the tools.  (Well, other than time being a limited resource
and time you may spend on doc tools being time you can't
spend on the hash conversion, which I imagine is a more
interesting project.)

Thanks,
Jeff King April 5, 2019, 6:46 p.m. UTC | #29
On Thu, Apr 04, 2019 at 10:17:21PM -0400, Todd Zullinger wrote:

> Elsewhere in this thread, Jeff made the very valid point
> that we're probably wise to keep using the docbook/xmlto
> chain as long as we're supporting both asciidoc and
> asciidoctor.  Unless it turns out that it's more work to
> coax asciidoctor (and the various 1.5 and 2.0 releases in
> common use) to work with that same docbook/xmlto chain than
> it is to do it directly, that is.

One of my secret (maybe not so secret?) implications there was that it
might be worth dropping asciidoc support sooner rather than later. I.e.,
if it is a burden to make it work with both old and new systems, then
let's make the jump to having it work with the new system.

IMHO we can be a bit more cavalier with saying "you must have a
recent-ish asciidoctor to build the docs", because it's so easy for us
to provide a binary distribution of the built HTML and manpages (in
fact, we already do so for the install-man-quick target).

So it doesn't really leave any platforms out in the cold; it just means
they have to tweak their build procedure.

-Peff
diff mbox series

Patch

diff --git a/Documentation/asciidoctor-extensions.rb b/Documentation/asciidoctor-extensions.rb
index 0089e0cfb8..059279dee1 100644
--- a/Documentation/asciidoctor-extensions.rb
+++ b/Documentation/asciidoctor-extensions.rb
@@ -20,9 +20,25 @@  module Git
         end
       end
     end
+
+    class DocumentPostProcessor < Asciidoctor::Extensions::Postprocessor
+      def process document, output
+        if document.basebackend? 'docbook'
+          git_version = document.attributes['git_version']
+          replacement = "" \
+            "<refmiscinfo class=\"source\">Git</refmiscinfo>\n" \
+            "<refmiscinfo class=\"version\">#{git_version}</refmiscinfo>\n" \
+            "<refmiscinfo class=\"manual\">Git Manual</refmiscinfo>\n" \
+            "<\/refmeta>"
+          output = output.sub(/<\/refmeta>/, replacement)
+        end
+        output
+      end
+    end
   end
 end
 
 Asciidoctor::Extensions.register do
   inline_macro Git::Documentation::LinkGitProcessor, :linkgit
+  postprocessor Git::Documentation::DocumentPostProcessor
 end