mbox series

[v3,00/15,RFC] Upstreaming the Scalar command

Message ID pull.1005.v3.git.1631129086.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Upstreaming the Scalar command | expand

Message

Philippe Blain via GitGitGadget Sept. 8, 2021, 7:24 p.m. UTC
tl;dr: This series contributes the Scalar command to the Git project. This
command provides an opinionated way to create and configure repositories
with a focus on very large repositories.


Background
==========

Years ago, Microsoft wanted to move the source code of the Windows operating
system to Git. The challenge there was to prove that Git could scale to
massive monorepos. The VFS for Git (formerly GVFS) project was born to take
up that challenge.

The final solution included a virtual filesystem (with both user-mode and
kernel components) and a customized fork of Git for Windows. This solution
contained several key concepts, such as only populating a portion of the
working directory, demand-fetching blobs, and performing periodic repo
maintenance in the background. However, the required kernel drivers made it
difficult to port the solution to other platforms.

But it was realized that many of these key concepts were independent of the
actual VFS and its projection of the working directory. The Scalar project
was created to make that separation, refine the key concepts, and then
extract those features into the new Scalar command.


The present
===========

The Scalar project provides a completely functional non-virtual experience
for monorepos. But why stop there. The Scalar project was designed to be a
self-destructing vehicle to allow those key concepts to be moved into core
Git itself for the benefit of all. For example, partial clone,
sparse-checkout, and background maintenance have already been upstreamed and
removed from Scalar proper. This patch series provides a C-based
implementation of the final remaining portions of the Scalar command. This
will make it easier for users to experiment with the Scalar command. It will
also make it substantially easier to experiment with moving functionality
from Scalar into core Git, while maintaining backwards-compatibility for
existing Scalar users.

The C-based Scalar has been shipped to Scalar users, and can be tested by
any interested reader:
https://github.com/microsoft/git/releases/tag/v2.33.0.vfs.0.0 (it offers a
Git for Windows installer, a macOS package and an Ubuntu package).


Opportunities
=============

Apart from providing the Scalar command, this contribution is intended to
serve as a basis for further mailing list discussions on moving (some of)
these key concepts into the main Git commands.

For example, we previously discussed the idea of a "git big-clone" that does
much of what "scalar clone" is doing. This patch series is a step to make
such functionality exist in the Git code base while we simmer on what such a
"git big-clone" command-line interface would look like.

This is one of many possible ways to do this. Creating a 'git big-clone'
could lock Git into backwards compatibility concerns so it is necessary to
approach such an endeavor with caution. As a discussion starter, the scalar
clone <url> command does roughly this:

 1. git clone --sparse --filter=blob:none /src
 2. git -C /src sparse-checkout init --cone
 3. git -C /src config (many times)
 4. git -C /src maintenance start

It is my hope inspire discussions about what parts of Scalar could go into
core Git, and where, and in which form. While we wish to maintain
backwards-compatibility of Scalar's command-line interface (because it is
already in use), by having the Scalar code in the same code base as Git's,
it will be much easier to move functionality without having to maintain
loose version coupling between independently-versioned Scalar and Git. The
tight version-coupling, along with having access to libgit.a also allows the
C-based implementation of Scalar to be much smaller than the original .NET
version.

For example, we might choose in the future to implement, say, git clone
--scale=partial,cone to initialize a partial clone with a cone-sparse
checkout, that would not only be totally doable, and not only would we
already have precedent and data to prove that this actually makes engineers
happy who have to work on ginormous repositories, but we could then also
implement it by moving parts of contrib/scalar/ to builtin/ (where
contrib/scalar/ would then call the built-ins accordingly rather than
hard-coding the defaults itself).

We now also have the opportunity to discuss the merits of Scalar's clone
caching, which is not actually part of this patch series because it is a bit
coupled with the GVFS parts of microsoft/git for the moment, where clones
automatically get registered with a populated alternate repository that is
identified by the URL, meaning: subsequent clones of the same repository are
vastly faster than the first one because they do not actually download the
already-received objects again, they access the cache instead.

Another thing that I could imagine to be discussed at length is the
distinction between enlistment and worktree (where the latter is the actual
Git worktree and usually lives in the src/ subdirectory of the former). This
encourages untracked and ignored files to be placed outside the worktree,
making Git's job much easier. This idea, too, might find its way in one way
or another into Git proper.

These are just a few concepts in Scalar that do not yet have equivalents in
Git. By putting this initial implementation into contrib/, we create a
foundation for future discussions of these concepts.

We plan on updating the recommended config settings in scalar register as
new Git features are available (such as builtin FSMonitor and sparse-index,
when ready). To facilitate upgrading existing Scalar enlistments, their
paths are automatically added to the [scalar] section of the global Git
config, and the scalar reconfigure --all command will process all of them.


Epilogue
========

Now, to address some questions that I imagine every reader has who made it
this far:

 * Why not put the Scalar functionality directly into a built-in? Creating a
   Git builtin requires scrutiny over every aspect of the feature, which is
   difficult to do while also maintaining the command-line interface
   contract and expected behavior of the Scalar command (there are existing
   users, after all). By having the Scalar command in contrib/, we present a
   simple option for users to have these features in the short term while
   the Git contributor community decides which bits to absorb into Git
   built-ins.
 * Why implement the Scalar command in the Git codebase? We ported Scalar to
   the microsoft/git fork for several reasons. First, we realized it was
   possible now that the core features exist inside Git itself. Second,
   compiling Scalar directly within a version of Git allows us to remove a
   version compatibility check from each config option that might or might
   not apply based on the installed Git version. Finally, this new location
   has greatly simplified our release process and the installation process
   for users. We now have ways to install Scalar with microsoft/git via
   winget, brew, and apt-get. This has been the case since we shipped
   v2.32.0 to our users, read: this setup has served us well already.
 * Why contribute Scalar to the Git project? We are biased, of course, yet
   we do have evidence that the Scalar command is a helpful tool that offers
   an simple way to handle huge repositories with ease. By contributing it
   to the core Git project, we are able to share it with more users,
   especially some users who do not want to install the microsoft/git fork.
   We intend to include Scalar as a component in git-for-windows/git, but
   are contributing it here first. Further, we think there is benefit to the
   Git developer community as this presents an example of how to set certain
   defaults that work for large repositories.
 * Does this integrate with the built-in FSMonitor yet? No, not yet. I do
   have a couple of add-on patch series lined up, one of them being the
   integration with the built-in FSMonitor, which obviously has to wait
   until the FSMonitor patch series advances further.

Changes since v2:

 * Adjusted the description of the list command in the manual page , as
   suggested by Bagas.
 * Addressed two style nits in cmd_run().
 * The documentation of git reconfigure -a was improved.

Changes since v1:

 * A couple typos were fixed
 * The code parsing the output of ls-remote was made more readable
 * The indentation used in scalar.txt now consistently uses tabs
 * We no longer hard-code core.bare = false when registering with Scalar

Derrick Stolee (4):
  scalar: 'register' sets recommended config and starts maintenance
  scalar: 'unregister' stops background maintenance
  scalar: implement 'scalar list'
  scalar: implement the `run` command

Johannes Schindelin (10):
  scalar: create a rudimentary executable
  scalar: start documenting the command
  scalar: create test infrastructure
  scalar: let 'unregister' handle a deleted enlistment directory
    gracefully
  scalar: implement the `clone` subcommand
  scalar: teach 'clone' to support the --single-branch option
  scalar: allow reconfiguring an existing enlistment
  scalar: teach 'reconfigure' to optionally handle all registered
    enlistments
  scalar: implement the `version` command
  scalar: accept -C and -c options before the subcommand

Matthew John Cheetham (1):
  scalar: implement the `delete` command

 Makefile                         |   8 +
 contrib/scalar/.gitignore        |   5 +
 contrib/scalar/Makefile          |  57 +++
 contrib/scalar/scalar.c          | 844 +++++++++++++++++++++++++++++++
 contrib/scalar/scalar.txt        | 154 ++++++
 contrib/scalar/t/Makefile        |  78 +++
 contrib/scalar/t/t9099-scalar.sh |  88 ++++
 7 files changed, 1234 insertions(+)
 create mode 100644 contrib/scalar/.gitignore
 create mode 100644 contrib/scalar/Makefile
 create mode 100644 contrib/scalar/scalar.c
 create mode 100644 contrib/scalar/scalar.txt
 create mode 100644 contrib/scalar/t/Makefile
 create mode 100755 contrib/scalar/t/t9099-scalar.sh


base-commit: ebf3c04b262aa27fbb97f8a0156c2347fecafafb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1005%2Fdscho%2Fscalar-the-beginning-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1005/dscho/scalar-the-beginning-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1005

Range-diff vs v2:

  1:  b8c7d3f8450 =  1:  b8c7d3f8450 scalar: create a rudimentary executable
  2:  4f886575dcf =  2:  4f886575dcf scalar: start documenting the command
  3:  bcfde9bc765 =  3:  bcfde9bc765 scalar: create test infrastructure
  4:  ee3e26a0c4e =  4:  ee3e26a0c4e scalar: 'register' sets recommended config and starts maintenance
  5:  6142f75875b =  5:  6142f75875b scalar: 'unregister' stops background maintenance
  6:  82dd253154f =  6:  82dd253154f scalar: let 'unregister' handle a deleted enlistment directory gracefully
  7:  fb7c931ddb3 !  7:  d291d3723a6 scalar: implement 'scalar list'
     @@ contrib/scalar/scalar.txt: an existing Git worktree with Scalar whose name is no
      +~~~~
      +
      +list::
     -+	To see which repositories are currently registered by the service, run
     -+	`scalar list`. This subcommand does not need to be run inside a Scalar
     -+	enlistment.
     ++	List enlistments that are currently registered by Scalar. This
     ++	subcommand does not need to be run inside an enlistment.
      +
       Register
       ~~~~~~~~
  8:  f3223c10788 !  8:  40dbf61771e scalar: implement the `clone` subcommand
     @@ contrib/scalar/scalar.txt: an existing Git worktree with Scalar whose name is no
       List
       ~~~~
       
     - list::
     - 	To see which repositories are currently registered by the service, run
     --	`scalar list`. This subcommand does not need to be run inside a Scalar
     --	enlistment.
     -+	`scalar list`. This subcommand, like `clone`, does not need to be run
     -+	inside a Scalar enlistment.
     - 
     - Register
     - ~~~~~~~~
      @@ contrib/scalar/scalar.txt: unregister [<enlistment>]::
       
       SEE ALSO
  9:  b3c4b3dccc6 =  9:  414dbe7d859 scalar: teach 'clone' to support the --single-branch option
 10:  b7fc2dc29c8 ! 10:  76de416a643 scalar: implement the `run` command
     @@ contrib/scalar/scalar.c: static int cmd_register(int argc, const char **argv)
      +	argc = parse_options(argc, argv, NULL, options,
      +			     usagestr, 0);
      +
     -+	if (argc == 0)
     ++	if (!argc)
      +		usage_with_options(usagestr, options);
      +
     -+	if (!strcmp("all", argv[0]))
     ++	if (!strcmp("all", argv[0])) {
      +		i = -1;
     -+	else {
     ++	} else {
      +		for (i = 0; tasks[i].arg && strcmp(tasks[i].arg, argv[0]); i++)
      +			; /* keep looking for the task */
      +
 11:  9a834c23d08 = 11:  655a902b9df scalar: allow reconfiguring an existing enlistment
 12:  79e9f5d203a ! 12:  2d1987bfcda scalar: teach 'reconfigure' to optionally handle all registered enlistments
     @@ contrib/scalar/scalar.txt: After a Scalar upgrade, or when the configuration of
       reconfigure the enlistment.
       
      +With the `--all` option, all enlistments currently registered with Scalar
     -+will be reconfigured. This option is meant to to be run every time after
     -+Scalar is upgraded.
     ++will be reconfigured. Use this option after each Scalar upgrade.
      +
       SEE ALSO
       --------
 13:  94a21982652 ! 13:  c67938299ee scalar: implement the `delete` command
     @@ contrib/scalar/scalar.txt: scalar register [<enlistment>]
       
       DESCRIPTION
       -----------
     -@@ contrib/scalar/scalar.txt: With the `--all` option, all enlistments currently registered with Scalar
     - will be reconfigured. This option is meant to to be run every time after
     - Scalar is upgraded.
     +@@ contrib/scalar/scalar.txt: reconfigure the enlistment.
     + With the `--all` option, all enlistments currently registered with Scalar
     + will be reconfigured. Use this option after each Scalar upgrade.
       
      +Delete
      +~~~~~~
 14:  707d8e19683 = 14:  d2cd2b7094b scalar: implement the `version` command
 15:  26e23b5c5e5 = 15:  7ccc4f8b9b0 scalar: accept -C and -c options before the subcommand

Comments

Ævar Arnfjörð Bjarmason Sept. 9, 2021, 10:14 a.m. UTC | #1
On Wed, Sep 08 2021, Johannes Schindelin via GitGitGadget wrote:

> Changes since v2:
>
>  * Adjusted the description of the list command in the manual page , as
>    suggested by Bagas.
>  * Addressed two style nits in cmd_run().
>  * The documentation of git reconfigure -a was improved.
>
> Changes since v1:
>
>  * A couple typos were fixed
>  * The code parsing the output of ls-remote was made more readable
>  * The indentation used in scalar.txt now consistently uses tabs
>  * We no longer hard-code core.bare = false when registering with Scalar

In the summary I had on v1->v2 points 1-3 are for v2->v3, respectively,
outstanding, addressed, outstanding:

    https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/

In addition the discussion ending here:
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/

For that point: I think it's fair enough not to properly handle the
cleanup case in "scalar clone", but perhaps add a note in the commit
message that unlike "git clone" this is known not to clean after itself
properly on ctrl+c?
Ævar Arnfjörð Bjarmason Sept. 13, 2021, 2:20 p.m. UTC | #2
On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Sep 08 2021, Johannes Schindelin via GitGitGadget wrote:
>
>> Changes since v2:
>>
>>  * Adjusted the description of the list command in the manual page , as
>>    suggested by Bagas.
>>  * Addressed two style nits in cmd_run().
>>  * The documentation of git reconfigure -a was improved.
>>
>> Changes since v1:
>>
>>  * A couple typos were fixed
>>  * The code parsing the output of ls-remote was made more readable
>>  * The indentation used in scalar.txt now consistently uses tabs
>>  * We no longer hard-code core.bare = false when registering with Scalar
>
> In the summary I had on v1->v2 points 1-3 are for v2->v3, respectively,
> outstanding, addressed, outstanding:
>
>     https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
>
> In addition the discussion ending here:
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/
>
> For that point: I think it's fair enough not to properly handle the
> cleanup case in "scalar clone", but perhaps add a note in the commit
> message that unlike "git clone" this is known not to clean after itself
> properly on ctrl+c?

Seeing [1] about the planned re-roll I have the above a shot a few days
ago, see the original discussion at [2] (indirectly linked above).

The dependency graph isn't quite there yet, but I basically had it
working (some twiddling around adding a new top-level command name
needed). Result:
    
     .gitignore             |  1 +
     Documentation/Makefile |  4 ++++
     Makefile               | 27 +++++++++++++++++++++++++++
     3 files changed, 32 insertions(+)

And by comparison, your v3:
    
     Makefile                  |  8 +++++
     contrib/scalar/.gitignore |  5 +++
     contrib/scalar/Makefile   | 57 ++++++++++++++++++++++++++++++++++
     contrib/scalar/t/Makefile | 78 +++++++++++++++++++++++++++++++++++++++++++++++
     4 files changed, 148 insertions(+)

So that's a very pleasing reduction in complexity.

The WIP change for that is below, some oddities like the
s/scalarscalar/scalar/ (couldn't find the bit that generated that
built-in blurb at the time).

But for one the automatic lint integration in Documentation/ found one
nit, and "make test" etc. all work with this automatically.

All guarded behind a CONTRIB_SCALAR flag, so the end result isn't in any
way different.

Again, as pointed out in [2] the proposal isn't in any way to change
what an end user sees, just to make our build system less
complex. Stretching dependency graphs across Makefiles is a pain.

So just as an example I was improving the "sparse" and "hdr-check"
targets today, with this dropped into the main Makefile under a flag
stuff like that will Just Work, if it's under its own Makefile
infrastructure it'll need to be treated specially for every such change,
ditto "make TAGS" etc.

1. http://lore.kernel.org/git/nycvar.QRO.7.76.6.2109131531210.55@tvgsbejvaqbjf.bet;
2. https://lore.kernel.org/git/87mtoxwt63.fsf@evledraar.gmail.com/

diff --git a/.gitignore b/.gitignore
index 311841f9bed..491cb2177af 100644
--- a/.gitignore
+++ b/.gitignore
@@ -216,6 +216,7 @@
 /configure
 /.vscode/
 /tags
+/scalar
 /TAGS
 /cscope*
 /compile_commands.json
diff --git a/Documentation/Makefile b/Documentation/Makefile
index f5605b7767f..f0a03faf40f 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -19,6 +19,10 @@ MAN1_TXT += git.txt
 MAN1_TXT += gitk.txt
 MAN1_TXT += gitweb.txt
 
+ifdef CONTRIB_SCALAR
+MAN1_TXT += scalar.txt
+endif
+
 # man5 / man7 guides (note: new guides should also be added to command-list.txt)
 MAN5_TXT += gitattributes.txt
 MAN5_TXT += githooks.txt
diff --git a/Makefile b/Makefile
index 429c276058d..7407df45b2a 100644
--- a/Makefile
+++ b/Makefile
@@ -584,6 +584,7 @@ FUZZ_OBJS =
 FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
+NONGIT_PROGRAM_OBJS =
 OBJECTS =
 PROGRAM_OBJS =
 PROGRAMS =
@@ -691,10 +692,17 @@ PROGRAM_OBJS += shell.o
 .PHONY: program-objs
 program-objs: $(PROGRAM_OBJS)
 
+ifdef CONTRIB_SCALAR
+NONGIT_PROGRAM_OBJS += scalar.o
+endif
+.PHONY: nongit-program-objs
+nongit-program-objs: $(NONGIT_PROGRAM_OBJS)
+
 # Binary suffix, set to .exe for Windows builds
 X =
 
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
+PROGRAMS += $(patsubst %.o,%$X,$(NONGIT_PROGRAM_OBJS))
 
 TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bitmap.o
@@ -800,6 +808,9 @@ BINDIR_PROGRAMS_NEED_X += git-receive-pack
 BINDIR_PROGRAMS_NEED_X += git-shell
 BINDIR_PROGRAMS_NEED_X += git-upload-archive
 BINDIR_PROGRAMS_NEED_X += git-upload-pack
+ifdef CONTRIB_SCALAR
+BINDIR_PROGRAMS_NEED_X += scalar
+endif
 
 BINDIR_PROGRAMS_NO_X += git-cvsserver
 
@@ -1247,6 +1258,10 @@ else
 ALL_COMMANDS_TO_INSTALL += git-receive-pack$(X)
 ALL_COMMANDS_TO_INSTALL += git-upload-archive$(X)
 ALL_COMMANDS_TO_INSTALL += git-upload-pack$(X)
+
+ifdef CONTRIB_SCALAR
+ALL_COMMANDS_TO_INSTALL += scalar$(X)
+endif
 endif
 
 ALL_CFLAGS = $(DEVELOPER_CFLAGS) $(CPPFLAGS) $(CFLAGS)
@@ -2459,6 +2474,7 @@ git-objs: $(GIT_OBJS)
 
 OBJECTS += $(GIT_OBJS)
 OBJECTS += $(PROGRAM_OBJS)
+OBJECTS += $(NONGIT_PROGRAM_OBJS)
 OBJECTS += $(TEST_OBJS)
 OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
@@ -2582,6 +2598,9 @@ compat/nedmalloc/nedmalloc.sp compat/nedmalloc/nedmalloc.o: EXTRA_CPPFLAGS = \
 compat/nedmalloc/nedmalloc.sp: SP_EXTRA_FLAGS += -Wno-non-pointer-null
 endif
 
+ifdef CONTRIB_SCALAR
+# TODO: Implicit rule for git-scalar here
+endif
 git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
@@ -2596,6 +2615,11 @@ git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
+ifdef CONTRIB_SCALAR
+scalar: scalar.o $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
+endif
+
 $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
 	$(QUIET_LNCP)$(RM) $@ && \
 	ln $< $@ 2>/dev/null || \
@@ -2800,6 +2824,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo SHELL_PATH=\''$(subst ','\'',$(SHELL_PATH_SQ))'\' >$@+
 	@echo TEST_SHELL_PATH=\''$(subst ','\'',$(TEST_SHELL_PATH_SQ))'\' >>$@+
 	@echo PERL_PATH=\''$(subst ','\'',$(PERL_PATH_SQ))'\' >>$@+
+	@echo CONTRIB_SCALAR=\''$(subst ','\'',$(subst ','\'',$(CONTRIB_SCALAR)))'\' >>$@+
 	@echo DIFF=\''$(subst ','\'',$(subst ','\'',$(DIFF)))'\' >>$@+
 	@echo PYTHON_PATH=\''$(subst ','\'',$(PYTHON_PATH_SQ))'\' >>$@+
 	@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
@@ -2881,6 +2906,8 @@ bin-wrappers/%: wrap-for-bin.sh
 	$(QUIET_GEN)sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
 	     -e 's|@@BUILD_DIR@@|$(shell pwd)|' \
 	     -e 's|@@PROG@@|$(patsubst test-%,t/helper/test-%$(X),$(@F))$(patsubst git%,$(X),$(filter $(@F),$(BINDIR_PROGRAMS_NEED_X)))|' < $< > $@ && \
+	sed -e 's|scalarscalar|scalar|' <$@ >$@+ && \
+	mv $@+ $@ && \
 	chmod +x $@
Johannes Schindelin Sept. 13, 2021, 8:53 p.m. UTC | #3
Hi Ævar,

On Mon, 13 Sep 2021, Ævar Arnfjörð Bjarmason wrote:

>
> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:
>
> > On Wed, Sep 08 2021, Johannes Schindelin via GitGitGadget wrote:
> >
> >> Changes since v2:
> >>
> >>  * Adjusted the description of the list command in the manual page , as
> >>    suggested by Bagas.
> >>  * Addressed two style nits in cmd_run().
> >>  * The documentation of git reconfigure -a was improved.
> >>
> >> Changes since v1:
> >>
> >>  * A couple typos were fixed
> >>  * The code parsing the output of ls-remote was made more readable
> >>  * The indentation used in scalar.txt now consistently uses tabs
> >>  * We no longer hard-code core.bare = false when registering with Scalar
> >
> > In the summary I had on v1->v2 points 1-3 are for v2->v3, respectively,
> > outstanding, addressed, outstanding:
> >
> >     https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
> >
> > In addition the discussion ending here:
> > https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/
> >
> > For that point: I think it's fair enough not to properly handle the
> > cleanup case in "scalar clone", but perhaps add a note in the commit
> > message that unlike "git clone" this is known not to clean after itself
> > properly on ctrl+c?
>
> Seeing [1] about the planned re-roll I have the above a shot a few days
> ago, see the original discussion at [2] (indirectly linked above).

There is a good reason why I did not engage in that tangent about
deviating from the established `contrib/*/Makefile` paradigm: I find it
particularly unrelated to what this here patch series is trying to
accomplish, and I cannot bring myself to be interested in the proposed
build system changes, either, because I do not see any benefit in the
changes, only downsides.

I find the distraction unnecessary.

Besides, the way I designed it, the code in `contrib/scalar/` intrudes as
little as possible on the core Git build system. The impact on the
top-level `Makefile` is quite minimal, which is just the way I firmly
believe it should be.

In short: I do not want those intrusive changes to the top-level
`Makefile`, not in this patch series, and not as a follow-up, either.

We have much bigger fries to fry: namely, how to migrate the improvements
for large-scale operations from Scalar to core Git, so that all Git users
can benefit. Granted, it will take a lot effort, and it would be easier to
move around `Makefile` rules instead. But ultimately, the benefit of
allowing users to handle larger repositories with ease will be worth that
effort.

Ciao,
Johannes
Ævar Arnfjörð Bjarmason Sept. 14, 2021, 10:59 a.m. UTC | #4
On Mon, Sep 13 2021, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 13 Sep 2021, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:
>>
>> > On Wed, Sep 08 2021, Johannes Schindelin via GitGitGadget wrote:
>> >
>> >> Changes since v2:
>> >>
>> >>  * Adjusted the description of the list command in the manual page , as
>> >>    suggested by Bagas.
>> >>  * Addressed two style nits in cmd_run().
>> >>  * The documentation of git reconfigure -a was improved.
>> >>
>> >> Changes since v1:
>> >>
>> >>  * A couple typos were fixed
>> >>  * The code parsing the output of ls-remote was made more readable
>> >>  * The indentation used in scalar.txt now consistently uses tabs
>> >>  * We no longer hard-code core.bare = false when registering with Scalar
>> >
>> > In the summary I had on v1->v2 points 1-3 are for v2->v3, respectively,
>> > outstanding, addressed, outstanding:
>> >
>> >     https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
>> >
>> > In addition the discussion ending here:
>> > https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/
>> >
>> > For that point: I think it's fair enough not to properly handle the
>> > cleanup case in "scalar clone", but perhaps add a note in the commit
>> > message that unlike "git clone" this is known not to clean after itself
>> > properly on ctrl+c?
>>
>> Seeing [1] about the planned re-roll I have the above a shot a few days
>> ago, see the original discussion at [2] (indirectly linked above).
>
> There is a good reason why I did not engage in that tangent about
> deviating from the established `contrib/*/Makefile` paradigm: I find it
> particularly unrelated to what this here patch series is trying to
> accomplish, and I cannot bring myself to be interested in the proposed
> build system changes, either, because I do not see any benefit in the
> changes, only downsides.
>
> I find the distraction unnecessary.

Perhaps I'm reading too much between the lines here, so forgive any
undue knee-jerk reaction.

But aside from any technical disagreement we may have I find this way of
handling reviews quite disrespectful of other people's time.

Sure, maybe I'm wrong, and maybe you either don't see any value in the
proposed changes or maybe they're just bad suggestions.

I still took the time to review and comment on a series you're
submitting. I think the least you can do is to include some comment in a
re-roll like:

    Skimmed Ævar's proposal about an alternate build system
    implementation, sorry, I just don't think it's worth it, going to
    not change anything there.

Which would be fair enough, and would leave the ball in my court in
terms of either dropping it, submitting any patch on top etc.

As opposed to just ignoring that whole thread, leaving both me wondering
if it's even been seen (and sending a few reminders like the linked
upthread), as well as others trying to track the state of the series.

> Besides, the way I designed it, the code in `contrib/scalar/` intrudes as
> little as possible on the core Git build system. The impact on the
> top-level `Makefile` is quite minimal, which is just the way I firmly
> believe it should be.
>
> In short: I do not want those intrusive changes to the top-level
> `Makefile`, not in this patch series, and not as a follow-up, either.

Communication grievances aside:

 * Saying that we have an "established `contrib/*/Makefile` paradigm"
   doesn't really follow in this case. Most of that isn't C code, and
   the bits that are C code are not using libgit.a.

   And as I've argued elsewhere I think that whole pattern was a mistake
   in the first place, it makes inter-Makefile dependencies a pain to
   manage, has resulted in bitrot of things like git-subtree.sh and
   mw-to-git, all because we conflate whether we want to build/test
   things with what we'd like in a default installation.

 * Because of that any number of targets / workflows in the Makefile
   aren't going to work by default, e.g. try checking it out and doing
   "make TAGS".

   That specific one happens to because we exclude contrib explicitly,
   that could be fixed, but there's going to be any number of things
   like that, but current and future ones.

 * One target that seems missing (maybe I've somehow missed it) is any
   support for installing the build command, its docs etc.

 * I hacked a bit more on this today and came up with a not-quite-ready
   change that both for the Makefile and Documentation/Makefile will
   build scalar.c like any other top-level command, and we'll always get
   a bin-wrapper/scalar, we'll only do something different at "install"
   time.

   This means that just like the "all:: $(FUZZ_OBJS)" we'll always build
   a scalar.o, so that'll prevent others from e.g. breaking a library
   function you rely on (which'll only be annoying caught in CI, or
   integration or whatever).

 * I think it's fair to say that whatever one thinks of my argument,
   your [1] leaves the reader hanging about *why* you went for this
   "make -C contrib/scalar/Makefile".

   As summarized there you tried to have it completely separate, but
   failed.

   So now there's a bit of integration in the top-level Makefile
   already. But why try in the first place? Just slavish conformance to
   existing "contrib/" convention?

   The dependency back on assets in subdirs there is deep, so surely
   it's not to build this independently somehow.

> We have much bigger fries to fry: namely, how to migrate the improvements
> for large-scale operations from Scalar to core Git, so that all Git users
> can benefit. Granted, it will take a lot effort, and it would be easier to
> move around `Makefile` rules instead. But ultimately, the benefit of
> allowing users to handle larger repositories with ease will be worth that
> effort.

It's your implementation that requires mostly moving around / copying
Makefile rules, if you piggy-back on the existing Makefile rules you'll
get most of the behavior you've duplicated for free without any moving
or copying.

But in any case, re the last bullet point above and your "[...]and not
as a follow-up, either" comments it's not clear whether you're saying
that you don't have time to work on this, or that you wouldn't want it
at all in any shape or form, even if someone else did it.

Which is not to say that I'm promising to do so even if that's the case,
I do think the onus is on the person proposing the change, and to take
productive feedback about things that are introducing unnecessary
complexity, and won't saddle others with undue technical debt going
forward.

1. https://lore.kernel.org/git/b8c7d3f84508ae0fb300f47c726764f4cbf46be9.1631129086.git.gitgitgadget@gmail.com/
Johannes Schindelin Sept. 14, 2021, 2:24 p.m. UTC | #5
Hi Ævar,

On Tue, 14 Sep 2021, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Sep 13 2021, Johannes Schindelin wrote:
>
> > On Mon, 13 Sep 2021, Ævar Arnfjörð Bjarmason wrote:
> >
> >> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:
> >>
> >> > In the summary I had on v1->v2 points 1-3 are for v2->v3,
> >> > respectively, outstanding, addressed, outstanding:
> >> >
> >> >     https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
> >> >
> >> > In addition the discussion ending here:
> >> > https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/
> >> >
> >> > For that point: I think it's fair enough not to properly handle the
> >> > cleanup case in "scalar clone", but perhaps add a note in the
> >> > commit message that unlike "git clone" this is known not to clean
> >> > after itself properly on ctrl+c?
> >>
> >> Seeing [1] about the planned re-roll I have the above a shot a few
> >> days ago, see the original discussion at [2] (indirectly linked
> >> above).
> >
> > There is a good reason why I did not engage in that tangent about
> > deviating from the established `contrib/*/Makefile` paradigm: I find
> > it particularly unrelated to what this here patch series is trying to
> > accomplish, and I cannot bring myself to be interested in the proposed
> > build system changes, either, because I do not see any benefit in the
> > changes, only downsides.
> >
> > I find the distraction unnecessary.
>
> Perhaps I'm reading too much between the lines here, so forgive any
> undue knee-jerk reaction.

Okay, let's try an analogy.

Imagine that a person is asking for directions to the train station. And
the other person is replying by asking "did you know that this train
station was built in 1878? It is actually quite interesting a story...
[and then goes on to describe the history and what excites them about
it]". Now, the first person tries again to ask for directions, again does
not get an answer to that question, and is slowly starting to look at
their watch. The second person, being completely oblivious to all of this,
goes on with their wonderful story about the train station and its
cultural heritage. So the first person walks a bit further to ask a third
person, but the second person is not done yet and says "but you haven't
heard me out! That's disrespectful!".

Just imagine for a minute how you would feel if you were the first person.

And that is how I feel asking for reviews about the Scalar patch series
and then being forcefully dragged into that tangent about the build
process.

I find the well-established paradigm to keep contrib/'s build procedures
as confined to their own directory as possible the most reasonable way to
handle the build by virtue of _not_ polluting the top-level Makefile
unnecessarily. All of your objections strike me simply as personal
viewpoints, not as technical arguments, and they fail to address this
"pollution of the top-level Makefile" problem. I therefore strongly
disagree with your suggestion that the build system should be changed, I
would even argue that your suggestion should been dismissed on purely
technical grounds, and I wish you hadn't forced me to say this as
forcefully.

And even if I looked more favorably on your suggestion to change the build
procedure, I find this distraction about the build as little constructive
as the explanations about the train station's history above. Those
suggestions do succeed in derailing the conversation about how Git could
scale better, how Scalar _does_ teach Git how to scale better, and about
how to teach Git itself more and more of Scalar's tricks.

If you have ideas how to teach, say, `git clone` to perform a couple of
Scalar's tricks, by all means, let's hear them, or even better, let's see
those patches. If you want to change the build system, still, I cannot
stop you from sending patches to that end to the Git mailing list, but
please expect me to be uninterested in them in any way, and to prefer to
spend my efforts to improve Git elsewhere. If you have other ideas how to
improve on Scalar in a user-perceptible way, however, I am all ears again.

I hope this clarifies it, without the need to read between the lines,
Johannes
Junio C Hamano Sept. 14, 2021, 5:29 p.m. UTC | #6
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Okay, let's try an analogy.
>
> Imagine that a person is asking for directions to the train station. And
> the other person is replying by asking "did you know that this train
> station was built in 1878? It is actually quite interesting a story...
> [and then goes on to describe the history and what excites them about
> it]". Now, the first person tries again to ask for directions, again does
> not get an answer to that question, and is slowly starting to look at
> their watch. The second person, being completely oblivious to all of this,
> goes on with their wonderful story about the train station and its
> cultural heritage. So the first person walks a bit further to ask a third
> person, but the second person is not done yet and says "but you haven't
> heard me out! That's disrespectful!".
>
> Just imagine for a minute how you would feel if you were the first person.
>
> And that is how I feel asking for reviews about the Scalar patch series
> and then being forcefully dragged into that tangent about the build
> process.

At least to me, how this Makefile for Scalar should interact with
the overall build process does not mesh well with the story about
hwo direction to and history of the station are unrelated.  If we
plan to start from contrib/ and eventually want to make it a part
of the core Git (i.e. "git scalar <subcmd> ..." becomes just like
"git bisect <subcmd> ..."), we would eventually need to see the
recipe needed for including "bisect" and "scalar" work the same
way, no?

I am getting the impression that such a unified build process is
Ævar wants to see at the end, I am not even sure if you do from
the above "analogy".  Cool down a bit, perhaps?

The following assumes that you share the goal of making "git
scalar" just like "git bisect"---another first class citizen of
Git toolbox, the user can choose to use it or the user may not
have a need to interact with it, but it exists there by default
and is not an opt-in add-on component.

I would understand it if your plan is to convert to a unified
build procedure at the very end of the upstreaming process, and
not while you populate contrib/ with more and more scalar stuff,
because the Makefile bits for the entire scalar, while not yet
upstreamed, has already been written as a separate procedure and
having to convert the whole thing upfront before you can start
trickle parts would mean you need to (re)start the process.  And
I would even be sympathetic if you felt it like a distraction.

But at least I view it as a step that needs to happen sometime
between now and at the end.  I do not yet have an opinion on
which one is more pleasant, between (1) having to deal with a
single Makefile that needs to be aware of two different locations
*.[ch] lives in, and (2) having to deal with two Makefiles that
duplicates definitions and risks them needlessly diverging.

I also would understand it if the reason why you want to keep the
top-level Makefile as intact as possible because you sense a high
probability that scalar will stay in contrib/ and even turn out
to be a failure.  Keeping the build procedure separated certainly
will keep it easier to yank it out later.  But I do not think
such a case is quite likely.

Thanks.
Ævar Arnfjörð Bjarmason Sept. 14, 2021, 6:09 p.m. UTC | #7
On Tue, Sep 14 2021, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> Okay, let's try an analogy.
>>
>> Imagine that a person is asking for directions to the train station. And
>> the other person is replying by asking "did you know that this train
>> station was built in 1878? It is actually quite interesting a story...
>> [and then goes on to describe the history and what excites them about
>> it]". Now, the first person tries again to ask for directions, again does
>> not get an answer to that question, and is slowly starting to look at
>> their watch. The second person, being completely oblivious to all of this,
>> goes on with their wonderful story about the train station and its
>> cultural heritage. So the first person walks a bit further to ask a third
>> person, but the second person is not done yet and says "but you haven't
>> heard me out! That's disrespectful!".
>>
>> Just imagine for a minute how you would feel if you were the first person.
>>
>> And that is how I feel asking for reviews about the Scalar patch series
>> and then being forcefully dragged into that tangent about the build
>> process.
>
> At least to me, how this Makefile for Scalar should interact with
> the overall build process does not mesh well with the story about
> hwo direction to and history of the station are unrelated.  If we
> plan to start from contrib/ and eventually want to make it a part
> of the core Git (i.e. "git scalar <subcmd> ..." becomes just like
> "git bisect <subcmd> ..."), we would eventually need to see the
> recipe needed for including "bisect" and "scalar" work the same
> way, no?
>
> I am getting the impression that such a unified build process is
> Ævar wants to see at the end, I am not even sure if you do from
> the above "analogy".  Cool down a bit, perhaps?
>
> The following assumes that you share the goal of making "git
> scalar" just like "git bisect"---another first class citizen of
> Git toolbox, the user can choose to use it or the user may not
> have a need to interact with it, but it exists there by default
> and is not an opt-in add-on component.
>
> I would understand it if your plan is to convert to a unified
> build procedure at the very end of the upstreaming process, and
> not while you populate contrib/ with more and more scalar stuff,
> because the Makefile bits for the entire scalar, while not yet
> upstreamed, has already been written as a separate procedure and
> having to convert the whole thing upfront before you can start
> trickle parts would mean you need to (re)start the process.  And
> I would even be sympathetic if you felt it like a distraction.
>
> But at least I view it as a step that needs to happen sometime
> between now and at the end.  I do not yet have an opinion on
> which one is more pleasant, between (1) having to deal with a
> single Makefile that needs to be aware of two different locations
> *.[ch] lives in, and (2) having to deal with two Makefiles that
> duplicates definitions and risks them needlessly diverging.

For what it's worth what I had on top of this is not (1) or (2), but a
(0): I.e. there isn't a contrib/scalar anymore, I moved:

    contrib/scalar/scalar.c -> scalar
    contrib/scalar/scalar.txt -> Documentation/scalar.txt
    contrib/scalar/t9099-scalar.sh -> t/t9099-scalar.sh

We build, test, and otherwise check (e.g. "make check-docs") it by
default, what we don't do is install it unless you ask. You need to run:

    # Or any other install* target
    make install install-doc INSTALL_SCALAR=YesPlease

It could be be kept in contrib/scalar/ even with that sort of approach,
and it would still be simpler than the two-Makefile approach.

But just moving the code, tests and documentation where everything else
lives cuts down an all sorts of special cases, file globs in various
places (e.g. doc lints) will just work and won't need adjustment.

> I also would understand it if the reason why you want to keep the
> top-level Makefile as intact as possible because you sense a high
> probability that scalar will stay in contrib/ and even turn out
> to be a failure.  Keeping the build procedure separated certainly
> will keep it easier to yank it out later.  But I do not think
> such a case is quite likely.

For what it's worth the WIP patch(es) I have on top of it will probably
make such a thing even easier, not that removing it from the tree would
be much of a problem in either case. It's mostly a few lines added to
lists in various places in the Makfile.

If I were to clean this up properly most of the changes would be
teaching the Makefile that it can build N number of named top-level
"special" commands that get dropped into bin/, not just the "git" we
hardcode now.

If you're interested I could try to clean that up and send something on
top, but given the tense-ness of the discussion & unrelated but relevant
patch queue I've got outstanding wouldn't do so otherwise...
Ævar Arnfjörð Bjarmason Sept. 14, 2021, 6:25 p.m. UTC | #8
On Tue, Sep 14 2021, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Tue, 14 Sep 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, Sep 13 2021, Johannes Schindelin wrote:
>>
>> > On Mon, 13 Sep 2021, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:
>> >>
>> >> > In the summary I had on v1->v2 points 1-3 are for v2->v3,
>> >> > respectively, outstanding, addressed, outstanding:
>> >> >
>> >> >     https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
>> >> >
>> >> > In addition the discussion ending here:
>> >> > https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109082112270.55@tvgsbejvaqbjf.bet/
>> >> >
>> >> > For that point: I think it's fair enough not to properly handle the
>> >> > cleanup case in "scalar clone", but perhaps add a note in the
>> >> > commit message that unlike "git clone" this is known not to clean
>> >> > after itself properly on ctrl+c?
>> >>
>> >> Seeing [1] about the planned re-roll I have the above a shot a few
>> >> days ago, see the original discussion at [2] (indirectly linked
>> >> above).
>> >
>> > There is a good reason why I did not engage in that tangent about
>> > deviating from the established `contrib/*/Makefile` paradigm: I find
>> > it particularly unrelated to what this here patch series is trying to
>> > accomplish, and I cannot bring myself to be interested in the proposed
>> > build system changes, either, because I do not see any benefit in the
>> > changes, only downsides.
>> >
>> > I find the distraction unnecessary.
>>
>> Perhaps I'm reading too much between the lines here, so forgive any
>> undue knee-jerk reaction.
>
> Okay, let's try an analogy.
>
> Imagine that a person is asking for directions to the train station. And
> the other person is replying by asking "did you know that this train
> station was built in 1878? It is actually quite interesting a story...
> [and then goes on to describe the history and what excites them about
> it]". Now, the first person tries again to ask for directions, again does
> not get an answer to that question, and is slowly starting to look at
> their watch. The second person, being completely oblivious to all of this,
> goes on with their wonderful story about the train station and its
> cultural heritage. So the first person walks a bit further to ask a third
> person, but the second person is not done yet and says "but you haven't
> heard me out! That's disrespectful!".

To be clear that's not what I said or meant. I wasn't saying you had to
exhaustively hear out some argument or participate in a discussion
you're not interested in.

I am saying that if you're soliciting feedback and you get some, and the
person giving you the feedback sends you pings back, as I did here:

    https://lore.kernel.org/git/871r6axban.fsf@evledraar.gmail.com/
    https://lore.kernel.org/git/87mtoxwt63.fsf@evledraar.gmail.com/
    https://lore.kernel.org/git/877dfupl7o.fsf@evledraar.gmail.com/
    https://lore.kernel.org/git/87r1dydp4m.fsf@evledraar.gmail.com/

That it would be nice to at least reply with some brief comment to the
effect that you're not personally interested in improving this area.

Now, shortly after you send this you re-rolled the v3
(https://lore.kernel.org/git/pull.1005.v4.git.1631630356.gitgitgadget@gmail.com/)
with a note of:

 * Removed the [RFC] prefix because I did not hear any objections against
   putting this into contrib/.

I don't know if you wrote that after this reply, or really didn't see
any of above, but that's a really inaccurate/misleading comment
considering that context.

> Just imagine for a minute how you would feel if you were the first person.
>
> And that is how I feel asking for reviews about the Scalar patch series
> and then being forcefully dragged into that tangent about the build
> process.
>
> I find the well-established paradigm to keep contrib/'s build procedures
> as confined to their own directory as possible the most reasonable way to
> handle the build by virtue of _not_ polluting the top-level Makefile
> unnecessarily. All of your objections strike me simply as personal
> viewpoints, not as technical arguments, and they fail to address this
> "pollution of the top-level Makefile" problem. I therefore strongly
> disagree with your suggestion that the build system should be changed, I
> would even argue that your suggestion should been dismissed on purely
> technical grounds, and I wish you hadn't forced me to say this as
> forcefully.
>
> And even if I looked more favorably on your suggestion to change the build
> procedure, I find this distraction about the build as little constructive
> as the explanations about the train station's history above.

If we're indulging each other, here's my version of that train analogy:

We're in the business of selling a sugary drink that comes in a red can.

We've got a big factory with an attached train station, and all the
trains that move our product are also red.

Now, some of our customers want the new purple colored sugary drink
we've cooked up. A new purple factory's all set up for making them.

But for some reason the part of the business and distribution plans call
for building a new purple train station parallel to our existing one.

All purple sugary drinks are expected to be moved on purple trains, all
our conductors will need some slight re-training and switchover time to
flip-flop between red and purple trains. Issues with purple production
floors and purple workflows will need to be ironed out.

We did a survey of our customers and most of them weren't even aware
that there was such a thing as train. Or perhaps they've seen other
trains, but most haven't seen our trains.

Trains occasionally need field servicing, luckily our fleet of red
trains is set up to carry its own spare parts. Purple trains can only be
serviced with the assistance of a red train.

A product survey asking customers whether their enjoyment of the red or
purple sugary drinks might be impacted by the color of the train they
were shipped on only resulted in puzzled blank looks from the
participants.

The current state of the purple train station and its fleet of purple
trains is that it somewhat works with some hiccups. The currently built
purple train station omitted any sort of train track for leaving the
station though. It should be easy to add one, but...

The manager of the purple train project has been asked whether we can't
just have our red machines in our red factory make the purple sugary
drink, which we can then load on our fleet of red trains. Why do we need
a new parallel rail system when we really care about customers drinking
our delicious sugary drinks?

In case it's not obvious:

   {red,purple} sugary drink  = /usr/bin/git & /usr/bin/scalar
   {red,purple} train station = ./Makefile & ./contrib/scalar/Makefile
   train track for leaving the station = make install (AFAICT your
                                         current patches have no
                                         installation mechanism)

> Those
> suggestions do succeed in derailing the conversation about how Git could
> scale better, how Scalar _does_ teach Git how to scale better, and about
> how to teach Git itself more and more of Scalar's tricks.
>
> If you have ideas how to teach, say, `git clone` to perform a couple of
> Scalar's tricks, by all means, let's hear them, or even better, let's see
> those patches. If you want to change the build system, still, I cannot
> stop you from sending patches to that end to the Git mailing list, but
> please expect me to be uninterested in them in any way, and to prefer to
> spend my efforts to improve Git elsewhere. If you have other ideas how to
> improve on Scalar in a user-perceptible way, however, I am all ears again.
>
> I hope this clarifies it, without the need to read between the lines,
> Johannes

Whatever the end goal of the patches you're sending part of them is
proposing a given approach to go from A to B.

I find it odd to be claiming that the end-state is so important that we
can't spare time to discuss some of the implementation
details.

Particularly in this case, where I've sent a mostly working patch-on-top
which I think should be clear from context I'd be willing to finish up,
so it's not like it's just pointless nitpicking with no end-goal of a
code improvement in sight.
Derrick Stolee Sept. 14, 2021, 8:35 p.m. UTC | #9
On 9/14/2021 2:09 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Sep 14 2021, Junio C Hamano wrote:
> 
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> At least to me, how this Makefile for Scalar should interact with
>> the overall build process does not mesh well with the story about
>> hwo direction to and history of the station are unrelated.  If we
>> plan to start from contrib/ and eventually want to make it a part
>> of the core Git (i.e. "git scalar <subcmd> ..." becomes just like
>> "git bisect <subcmd> ..."), we would eventually need to see the
>> recipe needed for including "bisect" and "scalar" work the same
>> way, no?

We should definitely work to find a better way to describe our
vision for how _the ideas in Scalar_ can be adopted into Git proper.

Before this series, we were adding functionality to Git that allowed
Scalar to simplify to just a CLI that configures Git features. This
submission allows that CLI to be available via an opt-in compile flag.
This should allow more users to try out the ideas and perhaps we find
the things that really work for people (and almost more importantly,
the ideas that are _too_ opinionated).

But the way I see Scalar being fully incorporated into Git is not as
a "git scalar <foo>" command or even having "scalar" be included by
default. Instead, perhaps a new builtin would need to be created and
its CLI would need to be presented and reviewed with significant
attention to long-term support in the Git project. Having Scalar as
a testing ground for these ideas seems like a positive way forward.

This is a big reason why we think that contrib/ is a good place for
it to exist.

>> I am getting the impression that such a unified build process is
>> Ævar wants to see at the end, I am not even sure if you do from
>> the above "analogy".  Cool down a bit, perhaps?

I agree that the temperature of this thread has gotten a bit
heated. I think there is something valuable to be gained from
each perspective, but not in a way that either has presented it.

>> The following assumes that you share the goal of making "git
>> scalar" just like "git bisect"---another first class citizen of
>> Git toolbox, the user can choose to use it or the user may not
>> have a need to interact with it, but it exists there by default
>> and is not an opt-in add-on component.
>>
>> I would understand it if your plan is to convert to a unified
>> build procedure at the very end of the upstreaming process, and
>> not while you populate contrib/ with more and more scalar stuff,
>> because the Makefile bits for the entire scalar, while not yet
>> upstreamed, has already been written as a separate procedure and
>> having to convert the whole thing upfront before you can start
>> trickle parts would mean you need to (re)start the process.  And
>> I would even be sympathetic if you felt it like a distraction.
>>
>> But at least I view it as a step that needs to happen sometime
>> between now and at the end.  I do not yet have an opinion on
>> which one is more pleasant, between (1) having to deal with a
>> single Makefile that needs to be aware of two different locations
>> *.[ch] lives in, and (2) having to deal with two Makefiles that
>> duplicates definitions and risks them needlessly diverging.

Since we already need to modify the root Makefile, I think having
the root Makefile add the files from contrib/scalar from an
optional flag is a great way to reduce duplication across multiple
Makefiles while also maintaining the Scalar is compiled optionally.

One big goal is to minimize how often we need to update Scalar. I
can see things like adjusting the recommended config once per
release cycle based on which new features are available. I don't
really want to be spending time updating the Makefile to match a
contribution that was already carefully reviewed and tested. I
also don't want to put the burden of updating contrib/scalar upon
those contributors.

> For what it's worth what I had on top of this is not (1) or (2), but a
> (0): I.e. there isn't a contrib/scalar anymore, I moved:
> 
>     contrib/scalar/scalar.c -> scalar>     contrib/scalar/scalar.txt -> Documentation/scalar.txt
>     contrib/scalar/t9099-scalar.sh -> t/t9099-scalar.sh
> 
> We build, test, and otherwise check (e.g. "make check-docs") it by
> default, what we don't do is install it unless you ask. You need to run:
> 
>     # Or any other install* target
>     make install install-doc INSTALL_SCALAR=YesPlease
> 
> It could be be kept in contrib/scalar/ even with that sort of approach,
> and it would still be simpler than the two-Makefile approach.

I think keeping it in contrib/scalar is best for now. But I do
agree that a single Makefile has benefits.

One early suggestion from a while back was to modify git.c to
handle the "scalar" executable as well as the "git" executable,
specifically to reduce duplication handling options such as

  -c config.key=value
  -C worktree
  --exec-path

and similar commands. While our duplication of the "-c" option
does add similar code in a second place, these other options
are less critical for Scalar, especially in its current version.
I think refactoring the code in git.c to cater to the "scalar"
executable is at least premature. If we want to pursue these
other options in the future, then that refactoring could happen
as a separate discussion after the rest of the build system and
CLI have been figured out.

_Perhaps_ Johannes still had that level of integration in his
head when responding to the single-Makefile recommendations.

> But just moving the code, tests and documentation where everything else
> lives cuts down an all sorts of special cases, file globs in various
> places (e.g. doc lints) will just work and won't need adjustment.
> 
>> I also would understand it if the reason why you want to keep the
>> top-level Makefile as intact as possible because you sense a high
>> probability that scalar will stay in contrib/ and even turn out
>> to be a failure.  Keeping the build procedure separated certainly
>> will keep it easier to yank it out later.  But I do not think
>> such a case is quite likely.
> 
> For what it's worth the WIP patch(es) I have on top of it will probably
> make such a thing even easier, not that removing it from the tree would
> be much of a problem in either case. It's mostly a few lines added to
> lists in various places in the Makfile.

Do you have a version of these patches available for adaptation
into this series? I'd like to take a look and see what it would
look like to squash them into this series. Forgive me if I just
missed the link. (I see the diff you posted earlier in this thread.)

> If I were to clean this up properly most of the changes would be
> teaching the Makefile that it can build N number of named top-level
> "special" commands that get dropped into bin/, not just the "git" we
> hardcode now.

This is an interesting idea for revamping how adjacent tools are
compiled and shipped with Git from contrib/ (or possibly elsewhere
if we decided to start including more things as "blessed helpers".

As a complete aside: I'm interested in using the sparse-checkout
feature as I work on the Git codebase, just to make sure I hit
pain points before any other user.

This is the best that I could do for my purposes:

$ git sparse-checkout list
.github
Documentation
builtin
compat
contrib/scalar
ewah
git-gui
gitk-git
gitk-gui
gitweb
mergetools
negotiator
perl
po
refs
sha1dc
sha256
t
templates
trace2
xdiff

And 'git status' reports that this includes 97% of the tracked
files. Perhaps there are ways to make this be smaller by having
make skip building things like git-gui if the directory doesn't
exist. Another idea would be to skip any logic around translating
messages if the 'po' directory is missing.

The reason I bring this up is that I'm interested in finding
ways to make our build system be streamlined a bit using the
presence of directories as a way to opt in/out of certain build
outputs. Since Scalar is being added as a new component, this is
a good opportunity to establish a pattern that works for this
effort, too.

Thanks,
-Stolee
Junio C Hamano Sept. 14, 2021, 9:49 p.m. UTC | #10
Junio C Hamano <gitster@pobox.com> writes:

> But at least I view it as a step that needs to happen sometime
> between now and at the end.  I do not yet have an opinion on
> which one is more pleasant, between (1) having to deal with a
> single Makefile that needs to be aware of two different locations
> *.[ch] lives in, and (2) having to deal with two Makefiles that
> duplicates definitions and risks them needlessly diverging.

FWIW, I am leaning towards the latter, with the assumption that this
may take more than a cycle to cook in contrib/.

Adding the Makefile bits to the top-level and keeping the topic in
'next' means all the topics that pass the pipeline will need to be
written in such a way that its Makefile change works well with and
without the unified Makefile bits from this topic, an additional
burden on other topics this topic would impose.

So it is understandable to want to keep the changes to the top-level
Makefile to the minimum, even if it may mean that it requires more
effort in the end to clean things up when the topic graduates.

An alternative would be to bypass the contrib/ phase and start as a
new subcommand that is first-class citizen from day one and let it
spend as much time as it needs to mature.  It would burden the
topics that pass the pipeline while this is cooking the same way as
having a unified build procedure in the top-level Makefile, of
course, though.
Theodore Ts'o Sept. 14, 2021, 11:22 p.m. UTC | #11
On Tue, Sep 14, 2021 at 04:35:42PM -0400, Derrick Stolee wrote:
> 
> We should definitely work to find a better way to describe our
> vision for how _the ideas in Scalar_ can be adopted into Git proper.
> ..
> 
> But the way I see Scalar being fully incorporated into Git is not as
> a "git scalar <foo>" command or even having "scalar" be included by
> default.

I agree 1000%.  It may be convenient for existing scalar users to have
a way they can stick with the CLI that they are used to, but before we
add this functionality to git proper, let's please make sure git users
get a CLI which makes sense and not dictated by history.

We already have enough people who complain that the git interface
design is hard to understand but which can't be changed because we
bias our UI design in favor of existing users as opposed new users.
Given that I'm an existing user, I'm actually not complaining, but I
do recognize the validity of the complaints that for example, git
reset does three different, mostly unrelated things.

Given that the existing scalar users are used to "scalar <foo>" and
not "git scalar <foo">, I'd gently suggest that it's better that there
be an existing compatibility program which translates "scalar <foo>"
to whatever git command(s) makes sense, as opposed to optimizing for
the simplicity of said program so that all forms of "scalar <foo>"
should get translated to "git scalar <foo>".

So for example, if we are inside a mono repro, it would seem to me
that "git commit" should automatically do the right thing, as opposed
to imposing cognitive load on the user to know when they are supposed
to type "git commit ..." versus "git scalar commit ..." versus "scalar
commit".

Please, let's design the UI for scalar integration into git with deep
sympathy for the users, and *not* the convenience of the compatibility
script for the existing scalar CLI users.

						- Ted
Ævar Arnfjörð Bjarmason Sept. 15, 2021, 5:51 p.m. UTC | #12
On Tue, Sep 14 2021, Derrick Stolee wrote:

> On 9/14/2021 2:09 PM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Tue, Sep 14 2021, Junio C Hamano wrote:
>> 
>>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>>
>>> At least to me, how this Makefile for Scalar should interact with
>>> the overall build process does not mesh well with the story about
>>> hwo direction to and history of the station are unrelated.  If we
>>> plan to start from contrib/ and eventually want to make it a part
>>> of the core Git (i.e. "git scalar <subcmd> ..." becomes just like
>>> "git bisect <subcmd> ..."), we would eventually need to see the
>>> recipe needed for including "bisect" and "scalar" work the same
>>> way, no?
>
> We should definitely work to find a better way to describe our
> vision for how _the ideas in Scalar_ can be adopted into Git proper.
>
> Before this series, we were adding functionality to Git that allowed
> Scalar to simplify to just a CLI that configures Git features. This
> submission allows that CLI to be available via an opt-in compile flag.
> This should allow more users to try out the ideas and perhaps we find
> the things that really work for people (and almost more importantly,
> the ideas that are _too_ opinionated).

Yeah that makes sense. I think it looks like a useful command & it's
already useful to users.

I haven't been suggesting any changes to what gets installed here, FWIW
I think we could be even more aggressive on that front, e.g. shipping it
unconditionally in libexec, maybe with an optional switch for
/usr/bin/scalar, or to ship "scalar" symlinked to "git" and have it
route to the top-level scalar command depending on argv.

I dabbled in that a bit locally, FWIW it seems if anything even easier
to do than the approaches we've discussed so far, but I wanted to focus
on providing the same behavior in terms of build system maintenance.

> But the way I see Scalar being fully incorporated into Git is not as
> a "git scalar <foo>" command or even having "scalar" be included by
> default. Instead, perhaps a new builtin would need to be created and
> its CLI would need to be presented and reviewed with significant
> attention to long-term support in the Git project. Having Scalar as
> a testing ground for these ideas seems like a positive way forward.

*Nod*

> This is a big reason why we think that contrib/ is a good place for
> it to exist.

Here's where you and Johannes lose me. There's some rationale in your
minds for why sticking it in contrib is the obvious way to go. So far
you've been describing how it'll look to users etc, how it's arranged in
our source tree only matters to git.git developers.

I think the actual reason is to carve in advance some subjective
ownership/apartness or whatever for this thing, if that's the case I
think documentation/commit messages would also work.

I really don't care much if something lives in contrib or not in the
abstract, but various integration around builds in Makefile makes that
much easier in practice, and if "make install" looks the same...

>>> I am getting the impression that such a unified build process is
>>> Ævar wants to see at the end, I am not even sure if you do from
>>> the above "analogy".  Cool down a bit, perhaps?
>
> I agree that the temperature of this thread has gotten a bit
> heated. I think there is something valuable to be gained from
> each perspective, but not in a way that either has presented it.

Thanks, hopefully we can keep it jovial going forward. If you've got any
(either on-list or off-list) feedback about how I can improve my side of
that it would be most welcome.

>>> The following assumes that you share the goal of making "git
>>> scalar" just like "git bisect"---another first class citizen of
>>> Git toolbox, the user can choose to use it or the user may not
>>> have a need to interact with it, but it exists there by default
>>> and is not an opt-in add-on component.
>>>
>>> I would understand it if your plan is to convert to a unified
>>> build procedure at the very end of the upstreaming process, and
>>> not while you populate contrib/ with more and more scalar stuff,
>>> because the Makefile bits for the entire scalar, while not yet
>>> upstreamed, has already been written as a separate procedure and
>>> having to convert the whole thing upfront before you can start
>>> trickle parts would mean you need to (re)start the process.  And
>>> I would even be sympathetic if you felt it like a distraction.
>>>
>>> But at least I view it as a step that needs to happen sometime
>>> between now and at the end.  I do not yet have an opinion on
>>> which one is more pleasant, between (1) having to deal with a
>>> single Makefile that needs to be aware of two different locations
>>> *.[ch] lives in, and (2) having to deal with two Makefiles that
>>> duplicates definitions and risks them needlessly diverging.
>
> Since we already need to modify the root Makefile, I think having
> the root Makefile add the files from contrib/scalar from an
> optional flag is a great way to reduce duplication across multiple
> Makefiles while also maintaining the Scalar is compiled optionally.
>
> One big goal is to minimize how often we need to update Scalar. I
> can see things like adjusting the recommended config once per
> release cycle based on which new features are available. I don't
> really want to be spending time updating the Makefile to match a
> contribution that was already carefully reviewed and tested. I
> also don't want to put the burden of updating contrib/scalar upon
> those contributors.

I'd think not having large parts of t/Makefile & Makefile should ease
that maintenance burden for you & others.

>> For what it's worth what I had on top of this is not (1) or (2), but a
>> (0): I.e. there isn't a contrib/scalar anymore, I moved:
>> 
>>     contrib/scalar/scalar.c -> scalar>     contrib/scalar/scalar.txt -> Documentation/scalar.txt
>>     contrib/scalar/t9099-scalar.sh -> t/t9099-scalar.sh
>> 
>> We build, test, and otherwise check (e.g. "make check-docs") it by
>> default, what we don't do is install it unless you ask. You need to run:
>> 
>>     # Or any other install* target
>>     make install install-doc INSTALL_SCALAR=YesPlease
>> 
>> It could be be kept in contrib/scalar/ even with that sort of approach,
>> and it would still be simpler than the two-Makefile approach.
>
> I think keeping it in contrib/scalar is best for now. But I do
> agree that a single Makefile has benefits.

I noted the "why contrib" above.

> One early suggestion from a while back was to modify git.c to
> handle the "scalar" executable as well as the "git" executable,
> specifically to reduce duplication handling options such as
>
>   -c config.key=value
>   -C worktree
>   --exec-path
>
> and similar commands. While our duplication of the "-c" option
> does add similar code in a second place, these other options
> are less critical for Scalar, especially in its current version.
> I think refactoring the code in git.c to cater to the "scalar"
> executable is at least premature. If we want to pursue these
> other options in the future, then that refactoring could happen
> as a separate discussion after the rest of the build system and
> CLI have been figured out.

As noted above that seems like a sensible way forward, I hadn't noticed
how much of git.c's setup was copied into scalar.c.

It seems to me that it wouldn't be that hard, on the order of the
existing setup code or less. I.e. just make "git.c" learn that it may be
running some arbitrary command name, and do some options parsing, but
and finally dispatch to a cmd_scalar(). IOW mostly like a built-in.

> _Perhaps_ Johannes still had that level of integration in his
> head when responding to the single-Makefile recommendations.
>
>> But just moving the code, tests and documentation where everything else
>> lives cuts down an all sorts of special cases, file globs in various
>> places (e.g. doc lints) will just work and won't need adjustment.
>> 
>>> I also would understand it if the reason why you want to keep the
>>> top-level Makefile as intact as possible because you sense a high
>>> probability that scalar will stay in contrib/ and even turn out
>>> to be a failure.  Keeping the build procedure separated certainly
>>> will keep it easier to yank it out later.  But I do not think
>>> such a case is quite likely.
>> 
>> For what it's worth the WIP patch(es) I have on top of it will probably
>> make such a thing even easier, not that removing it from the tree would
>> be much of a problem in either case. It's mostly a few lines added to
>> lists in various places in the Makfile.
>
> Do you have a version of these patches available for adaptation
> into this series? I'd like to take a look and see what it would
> look like to squash them into this series. Forgive me if I just
> missed the link. (I see the diff you posted earlier in this thread.)

I've got it at
https://github.com/avar/git/tree/avar-dscho/scalar-the-beginning-normalize-Makefile

Not very ML-ready, and soft-depends on some other Makefile cleanups I
thought I'd do & still haven't untangled and submitted. Soft-depends as
in this can easily be done on master, but some of the variable names
etc. are quite confusing there.

But you should be able to check it out, it'll build, test and install if
you run "install" with "INSTALL_SCALAR=Y".

You may run into on everly eager new (but unrelated to this, I just
merged it on top) Makefile assertion I'm experimenting with, just
comment out the relevant line in the Makefile if that happens,
i.e. something like this error:

    Makefile:3608: *** "please sort and de-duplicate BUILT_INS_EXTRA!".  Stop.

>> If I were to clean this up properly most of the changes would be
>> teaching the Makefile that it can build N number of named top-level
>> "special" commands that get dropped into bin/, not just the "git" we
>> hardcode now.
>
> This is an interesting idea for revamping how adjacent tools are
> compiled and shipped with Git from contrib/ (or possibly elsewhere
> if we decided to start including more things as "blessed helpers".
>
> As a complete aside: I'm interested in using the sparse-checkout
> feature as I work on the Git codebase, just to make sure I hit
> pain points before any other user.
>
> This is the best that I could do for my purposes:
>
> $ git sparse-checkout list
> .github
> Documentation
> builtin
> compat
> contrib/scalar
> ewah
> git-gui
> gitk-git
> gitk-gui
> gitweb
> mergetools
> negotiator
> perl
> po
> refs
> sha1dc
> sha256
> t
> templates
> trace2
> xdiff
>
> And 'git status' reports that this includes 97% of the tracked
> files. Perhaps there are ways to make this be smaller by having
> make skip building things like git-gui if the directory doesn't
> exist. Another idea would be to skip any logic around translating
> messages if the 'po' directory is missing.

For git-gui in particular NO_TCLTK=Y should do it.

> The reason I bring this up is that I'm interested in finding
> ways to make our build system be streamlined a bit using the
> presence of directories as a way to opt in/out of certain build
> outputs. Since Scalar is being added as a new component, this is
> a good opportunity to establish a pattern that works for this
> effort, too.

Sure, the hard part isn't that you can't grep out nonexisting files or
directories when building where we now use a glob.

It's that everything downstream of that, i.e. tests, installation
etc. is going to have to work properly in the face of arbitrary parts of
what the developer who tested the code expected going missing.

Which is why we've generally carved out very specific things, usually
along the boundaries of installed dependencies.
Johannes Schindelin Oct. 6, 2021, 8:09 p.m. UTC | #13
Hi Junio,

On Tue, 14 Sep 2021, Junio C Hamano wrote:

> An alternative would be to bypass the contrib/ phase and start as a
> new subcommand that is first-class citizen from day one and let it
> spend as much time as it needs to mature.

I don't think that there is a lot of sense in that. The main benefits of
`scalar` are in the `register` and the `clone` part, and the most natural
end game would hence be for `git init` and `git clone` to sprout new
options to support Scalar's features, in a Git-native way.

As I have explained earlier, the `scalar` command has existing users, and
therefore its command-line interface is not up for discussion (for
example, turning `scalar` into `git scalar` would be a usability
disaster). Scalar's _functionality_, however, should make it into Git
proper. Into existing built-ins, that is.

So I don't think that the contrib/ phase can be by-passed. It would not
make sense to port Scalar to a new builtin. To the contrary,
contrib/scalar/ should be the final destination for the `scalar` command.
And you can't bypass a final destination. That simply makes no sense.

So why bother with contrib/ at all? you may ask. The reason is that it
makes it substantially easier for me to move the features into core Git,
as I can incrementally implement those new options for Git's built-ins,
use them in `contrib/scalar/` instead of duplicating the functionality,
and then make use of Scalar's Functional Test suite for a much more
comprehensive testing (which has served us already really well in the
past). It also doesn't hurt that this way, my day job will be very happy
because Scalar users directly benefit from that work.

Of course, these suggestions to integrate Scalar more into the core part
of Git (missing the point that the final destination for the functionality
is not a new built-in, but rather new options for existing built-ins) made
everything much more cumbersome for me instead, for no gain that would be
apparent to me, impeding on aforementioned ease to move the features into
core Git (which has not happened yet, as a consequence), but hopefully
this will soon be a thing of the past.

So I would like to request that we close the discussion about the question
whether to integrate Scalar more into the top-level Makefile or into
git.c, and instead go ahead with keeping the `scalar` command in
contrib/scalar/. The freed-up time can then be used to focus on the much
more rewarding project of upstreaming Scalar's functionality such as
teaching `git clone` a short-and-sweet option that Just Makes Sense for
large monorepos (i.e. that imitates at least a large part of what `scalar
clone` does right now).

Ciao,
Dscho
Junio C Hamano Oct. 6, 2021, 8:25 p.m. UTC | #14
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I don't think that there is a lot of sense in that. The main benefits of
> `scalar` are in the `register` and the `clone` part, and the most natural
> end game would hence be for `git init` and `git clone` to sprout new
> options to support Scalar's features, in a Git-native way.

Yes, that is even better.  An endgame where everybody benefits
natively would be highly desirable.

Now you are back, do you think we can have the "no more preserve
merges backend" topic graduate to 'master', or do you prefer to cook
it over the cycle (or even two)?

Thanks.
Ævar Arnfjörð Bjarmason Oct. 7, 2021, 1:03 a.m. UTC | #15
On Wed, Oct 06 2021, Johannes Schindelin wrote:

> On Tue, 14 Sep 2021, Junio C Hamano wrote:
>
>> An alternative would be to bypass the contrib/ phase and start as a
>> new subcommand that is first-class citizen from day one and let it
>> spend as much time as it needs to mature.
>
> I don't think that there is a lot of sense in that. The main benefits of
> `scalar` are in the `register` and the `clone` part, and the most natural
> end game would hence be for `git init` and `git clone` to sprout new
> options to support Scalar's features, in a Git-native way.
>
> As I have explained earlier, the `scalar` command has existing users,
> and therefore its command-line interface is not up for discussion (for
> example, turning `scalar` into `git scalar` would be a usability
> disaster). Scalar's _functionality_, however, should make it into Git
> proper. Into existing built-ins, that is.

Given the sub-thread this seems like it's trying to be an indirect reply
to me, but I haven't been advocating changing the UX of "scalar" in any
way, or that we should have a "git scalar".

I have in fact been advocating exactly what you're advocating here. So
we're in violent agreement. Yes the CLI UI shouldn't change, that's the
whole point of having a "scalar" in git.git, not a "git scalar" or
whatever.

I've been suggesting we can simplify the *build system* git.git uses,
something no user will ever see.

> So I don't think that the contrib/ phase can be by-passed. It would not
> make sense to port Scalar to a new builtin. To the contrary,
> contrib/scalar/ should be the final destination for the `scalar` command.
> And you can't bypass a final destination. That simply makes no sense.

I'm right now using a "scalar" installed on my system based on the diff
at the end of this E-Mail that changes the *build system* without any
user-facing changes (see
https://lore.kernel.org/git/87k0jhn0p9.fsf@evledraar.gmail.com/) for a
previous reference.
    
    $ which scalar
    /home/avar/local/bin/scalar
    $ scalar -h 2>&1 | head -n 1
    usage: scalar [-C <directory>] [-c <key>=<value>] <command> [<options>]
    $ man scalar
    [...]
    NAME
           scalar - an opinionated repository management tool
    $ git help scalar
    No manual entry for gitscalar

Except that I think your patches as they stand (correct me if I'm wrong)
don't have any way to install it or its documentation, just to build it
in-place.

> So why bother with contrib/ at all? you may ask. The reason is that it
> makes it substantially easier for me to move the features into core Git,
> as I can incrementally implement those new options for Git's built-ins,
> use them in `contrib/scalar/` instead of duplicating the functionality,
> and then make use of Scalar's Functional Test suite for a much more
> comprehensive testing (which has served us already really well in the
> past). It also doesn't hurt that this way, my day job will be very happy
> because Scalar users directly benefit from that work.

I think all of that summarizes "why have this live in git.git", which I
100% agree with. It's not a counter-argument to a working solution for
simplifying your proposed build system integration.

> Of course, these suggestions to integrate Scalar more into the core part
> of Git (missing the point that the final destination for the functionality
> is not a new built-in, but rather new options for existing built-ins) made
> everything much more cumbersome for me instead, for no gain that would be
> apparent to me, impeding on aforementioned ease to move the features into
> core Git (which has not happened yet, as a consequence), but hopefully
> this will soon be a thing of the past.

The diff below doesn't make scalar a built-in.

> So I would like to request that we close the discussion about the question
> whether to integrate Scalar more into the top-level Makefile or into
> git.c, and instead go ahead with keeping the `scalar` command in
> contrib/scalar/. The freed-up time can then be used to focus on the much
> more rewarding project of upstreaming Scalar's functionality such as
> teaching `git clone` a short-and-sweet option that Just Makes Sense for
> large monorepos (i.e. that imitates at least a large part of what `scalar
> clone` does right now).

The reason I care about this is because duplicating this as a one-off
may be easier for you now, but it creates a lot of maintenance burden
for others down the line.

For example, I've been fixing memory leaks recently and have a
linux-leaks CI job that's now running on GitHub. Scalar would benefit
from having a t/t9099-scalar.sh run as part of that.

Yes you can special-case it somehow and teach ci/, or even "make test"
to do the same thing. But there's a lot of little things both current
and future that are going to be like that. Here's another one:

    $ make -C contrib/scalar scalar
    $ echo bad >cache.h
    $ make -C contrib/scalar scalar
    make: 'scalar' is up to date.

I.e. because it's using its own Makefile it's not getting the very
basics of the dependency graph right. My version?

    $ make scalar
    $ touch cache.h
    $ make scalar
    [... Works exactly as well as doing the same with "git"...]

Yes we could fix that and other things etc, but why not just get all
that for free in around 1/4 lines of Makefile boilerplate (and less than
that in complexity)?

 .gitignore                                   |  1 +
 Documentation/Makefile                       |  3 ++
 {contrib/scalar => Documentation}/scalar.txt |  4 +-
 Makefile                                     | 44 +++++++++++-----
 contrib/scalar/.gitignore                    |  5 --
 contrib/scalar/Makefile                      | 57 --------------------
 contrib/scalar/t/Makefile                    | 78 ----------------------------
 contrib/scalar/scalar.c => scalar.c          |  0
 {contrib/scalar/t => t}/t9099-scalar.sh      |  8 +--
 9 files changed, 37 insertions(+), 163 deletions(-)

[Note that this probably doesn't fully apply to your series/master, not
because of any great textual/semantic conflict, but just because I find
it convenient to stack my WIP/unsubmitted serieses in related areas on
top of one another. This is on top of other Makefile changes I've got
unsubmitted, but it would be easy/trivial to re-apply on master+your
series].

diff --git a/.gitignore b/.gitignore
index 68ecb7f7e9c..ec9771e3532 100644
--- a/.gitignore
+++ b/.gitignore
@@ -217,6 +217,7 @@
 /configure
 /.vscode/
 /tags
+/scalar
 /TAGS
 /cscope*
 /compile_commands.json
diff --git a/Documentation/Makefile b/Documentation/Makefile
index f5605b7767f..57b72e1999a 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -18,6 +18,9 @@ MAN1_TXT += $(filter-out \
 MAN1_TXT += git.txt
 MAN1_TXT += gitk.txt
 MAN1_TXT += gitweb.txt
+ifndef NO_INSTALL_SCALAR_DOC
+MAN1_TXT += scalar.txt
+endif
 
 # man5 / man7 guides (note: new guides should also be added to command-list.txt)
 MAN5_TXT += gitattributes.txt
diff --git a/contrib/scalar/scalar.txt b/Documentation/scalar.txt
similarity index 99%
rename from contrib/scalar/scalar.txt
rename to Documentation/scalar.txt
index 3a80f829edc..f287ab18c31 100644
--- a/contrib/scalar/scalar.txt
+++ b/Documentation/scalar.txt
@@ -149,6 +149,6 @@ SEE ALSO
 --------
 linkgit:git-clone[1], linkgit:git-maintenance[1].
 
-Scalar
+GIT
 ---
-Associated with the linkgit:git[1] suite
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index fd617f8c775..536028922e1 100644
--- a/Makefile
+++ b/Makefile
@@ -602,6 +602,7 @@ LIB_OBJS =
 LIB_OBJS_NO_COMPAT_OBJS =
 OBJECTS =
 PROGRAM_OBJS =
+SCALAR_OBJS =
 SCRIPT_LIB =
 SCRIPT_PERL =
 SCRIPT_PROGRAMS =
@@ -807,6 +808,7 @@ BUILT_INS += $(BUILT_INS_EXTRA)
 
 # what 'all' will build but not install in gitexecdir
 OTHER_PROGRAMS = git$X
+OTHER_PROGRAMS += scalar$X
 ARTIFACTS_TAR += $(OTHER_PROGRAMS)
 
 # what test wrappers are needed and 'install' will install, in bindir
@@ -818,12 +820,18 @@ BINDIR_PROGRAMS_NEED_X += git-upload-pack
 
 BINDIR_PROGRAMS_NO_X += git-cvsserver
 
+ifdef INSTALL_SCALAR
+BINDIR_PROGRAMS_NEED_X += scalar
+endif
 INSTALL_BINDIR_XPROGRAMS += $(patsubst %,%$X,$(BINDIR_PROGRAMS_NEED_X))
 INSTALL_BINDIR_PROGRAMS += $(INSTALL_BINDIR_XPROGRAMS) $(BINDIR_PROGRAMS_NO_X)
 
 # We have bin-wrappers for programs that we don't install
 TEST_BINDIR_PROGRAMS_NEED_X += $(BINDIR_PROGRAMS_NEED_X)
 TEST_BINDIR_PROGRAMS_NEED_X += $(TEST_PROGRAMS_NEED_X)
+ifndef INSTALL_SCALAR
+TEST_BINDIR_PROGRAMS_NEED_X += scalar$X
+endif
 
 TEST_BINDIR_PROGRAMS += $(TEST_BINDIR_PROGRAMS_NEED_X)
 TEST_BINDIR_PROGRAMS += $(BINDIR_PROGRAMS_NO_X)
@@ -2230,7 +2238,7 @@ please_set_SHELL_PATH_to_a_more_modern_shell:
 
 shell_compatibility_test: please_set_SHELL_PATH_to_a_more_modern_shell
 
-strip: $(BIN_PROGRAMS) git$X
+strip: $(BIN_PROGRAMS) git$X scalar$X
 	$(STRIP) $(STRIP_OPTS) $^
 
 ### Flags affecting all rules
@@ -2286,6 +2294,10 @@ git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
+scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
+		$(filter %.o,$^) $(LIBS)
+
 help.sp help.s help.o: command-list.h
 
 builtin/help.sp builtin/help.s builtin/help.o: config-list.h GIT-PREFIX
@@ -2557,7 +2569,12 @@ GIT_OBJS += git.o
 .PHONY: git-objs
 git-objs: $(GIT_OBJS)
 
+SCALAR_OBJS += scalar.o
+.PHONY: scalar-objs
+scalar-objs: $(SCALAR_OBJS)
+
 OBJECTS += $(GIT_OBJS)
+OBJECTS += $(SCALAR_OBJS)
 OBJECTS += $(PROGRAM_OBJS)
 OBJECTS += $(TEST_OBJS)
 OBJECTS += $(XDIFF_OBJS)
@@ -2565,13 +2582,10 @@ OBJECTS += $(FUZZ_OBJS)
 ifndef NO_CURL
 	OBJECTS += http.o http-walker.o remote-curl.o
 endif
+OBJECTS += $(SCALAR_OBJECTS)
 .PHONY: objects
 objects: $(OBJECTS)
 
-SCALAR_SOURCES := contrib/scalar/scalar.c
-SCALAR_OBJECTS := $(SCALAR_SOURCES:c=o)
-OBJECTS += $(SCALAR_OBJECTS)
-
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
 dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
 
@@ -2710,10 +2724,6 @@ $(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
-contrib/scalar/scalar$X: $(SCALAR_OBJECTS) GIT-LDFLAGS $(GITLIBS)
-	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
-		$(filter %.o,$^) $(LIBS)
-
 $(LIB_FILE): $(LIB_OBJS) $(LIB_COMPAT_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
@@ -3260,11 +3270,17 @@ endif
 install-gitweb:
 	$(MAKE) -C gitweb install
 
+ifdef INSTALL_SCALAR
+NO_INSTALL_SCALAR_DOC =
+else
+NO_INSTALL_SCALAR_DOC = NoScalarPlease
+endif
+
 install-doc: install-man-perl
-	$(MAKE) -C Documentation install
+	$(MAKE) -C Documentation install NO_INSTALL_SCALAR_DOC=$(NO_INSTALL_SCALAR_DOC)
 
 install-man: install-man-perl
-	$(MAKE) -C Documentation install-man
+	$(MAKE) -C Documentation install-man NO_INSTALL_SCALAR_DOC=$(NO_INSTALL_SCALAR_DOC)
 
 install-man-perl: man-perl
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(mandir_SQ)/man3'
@@ -3272,13 +3288,13 @@ install-man-perl: man-perl
 	(cd '$(DESTDIR_SQ)$(mandir_SQ)/man3' && umask 022 && $(TAR) xof -)
 
 install-html:
-	$(MAKE) -C Documentation install-html
+	$(MAKE) -C Documentation install-html NO_INSTALL_SCALAR_DOC=$(NO_INSTALL_SCALAR_DOC)
 
 install-info:
-	$(MAKE) -C Documentation install-info
+	$(MAKE) -C Documentation install-info NO_INSTALL_SCALAR_DOC=$(NO_INSTALL_SCALAR_DOC)
 
 install-pdf:
-	$(MAKE) -C Documentation install-pdf
+	$(MAKE) -C Documentation install-pdf NO_INSTALL_SCALAR_DOC=$(NO_INSTALL_SCALAR_DOC)
 
 quick-install-doc:
 	$(MAKE) -C Documentation quick-install
diff --git a/contrib/scalar/.gitignore b/contrib/scalar/.gitignore
deleted file mode 100644
index 00441073f59..00000000000
--- a/contrib/scalar/.gitignore
+++ /dev/null
@@ -1,5 +0,0 @@
-/*.xml
-/*.1
-/*.html
-/*.exe
-/scalar
diff --git a/contrib/scalar/Makefile b/contrib/scalar/Makefile
deleted file mode 100644
index 8620042f281..00000000000
--- a/contrib/scalar/Makefile
+++ /dev/null
@@ -1,57 +0,0 @@
-QUIET_SUBDIR0  = +$(MAKE) -C # space to separate -C and subdir
-QUIET_SUBDIR1  =
-
-ifneq ($(findstring s,$(MAKEFLAGS)),s)
-ifndef V
-	QUIET_GEN      = @echo '   ' GEN $@;
-	QUIET_SUBDIR0  = +@subdir=
-	QUIET_SUBDIR1  = ;$(NO_SUBDIR) echo '   ' SUBDIR $$subdir; \
-			 $(MAKE) $(PRINT_DIR) -C $$subdir
-	QUIET          = @
-else
-	export V
-endif
-endif
-
-all:
-
-include ../../config.mak.uname
--include ../../config.mak.autogen
--include ../../config.mak
-
-TARGETS = scalar$(X) scalar.o
-GITLIBS = ../../common-main.o ../../libgit.a ../../xdiff/lib.a
-
-all: scalar$X ../../bin-wrappers/scalar
-
-$(GITLIBS):
-	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) $(subst ../../,,$@)
-
-$(TARGETS): $(GITLIBS) scalar.c
-	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) $(patsubst %,contrib/scalar/%,$@)
-
-clean:
-	$(RM) $(TARGETS) ../../bin-wrappers/scalar
-	$(RM) scalar.1 scalar.html scalar.xml
-
-../../bin-wrappers/scalar: ../../wrap-for-bin.sh Makefile
-	@mkdir -p ../../bin-wrappers
-	$(QUIET_GEN)sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
-	     -e 's|@@BUILD_DIR@@|$(shell cd ../.. && pwd)|' \
-	     -e 's|@@PROG@@|contrib/scalar/scalar$(X)|' < $< > $@ && \
-	chmod +x $@
-
-test: all
-	$(MAKE) -C t
-
-docs: scalar.html scalar.1
-
-scalar.html: | scalar.1 # prevent them from trying to build `doc.dep` in parallel
-
-scalar.html scalar.1: scalar.txt
-	$(QUIET_SUBDIR0)../../Documentation$(QUIET_SUBDIR1) \
-		MAN_TXT=../contrib/scalar/scalar.txt \
-		../contrib/scalar/$@
-	$(QUIET)test scalar.1 != "$@" || mv ../../Documentation/$@ .
-
-.PHONY: all clean docs test FORCE
diff --git a/contrib/scalar/t/Makefile b/contrib/scalar/t/Makefile
deleted file mode 100644
index 6170672bb37..00000000000
--- a/contrib/scalar/t/Makefile
+++ /dev/null
@@ -1,78 +0,0 @@
-# Run scalar tests
-#
-# Copyright (c) 2005,2021 Junio C Hamano, Johannes Schindelin
-#
-
--include ../../../config.mak.autogen
--include ../../../config.mak
-
-SHELL_PATH ?= $(SHELL)
-PERL_PATH ?= /usr/bin/perl
-RM ?= rm -f
-PROVE ?= prove
-DEFAULT_TEST_TARGET ?= test
-TEST_LINT ?= test-lint
-
-ifdef TEST_OUTPUT_DIRECTORY
-TEST_RESULTS_DIRECTORY = $(TEST_OUTPUT_DIRECTORY)/test-results
-else
-TEST_RESULTS_DIRECTORY = ../../../t/test-results
-endif
-
-# Shell quote;
-SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
-PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))
-TEST_RESULTS_DIRECTORY_SQ = $(subst ','\'',$(TEST_RESULTS_DIRECTORY))
-
-T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))
-
-all: $(DEFAULT_TEST_TARGET)
-
-test: $(TEST_LINT)
-	$(MAKE) aggregate-results-and-cleanup
-
-prove: $(TEST_LINT)
-	@echo "*** prove ***"; GIT_CONFIG=.git/config $(PROVE) --exec '$(SHELL_PATH_SQ)' $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
-	$(MAKE) clean-except-prove-cache
-
-$(T):
-	@echo "*** $@ ***"; GIT_CONFIG=.git/config '$(SHELL_PATH_SQ)' $@ $(GIT_TEST_OPTS)
-
-clean-except-prove-cache:
-	$(RM) -r 'trash directory'.* '$(TEST_RESULTS_DIRECTORY_SQ)'
-	$(RM) -r valgrind/bin
-
-clean: clean-except-prove-cache
-	$(RM) .prove
-
-test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax
-
-test-lint-duplicates:
-	@dups=`echo $(T) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
-		test -z "$$dups" || { \
-		echo >&2 "duplicate test numbers:" $$dups; exit 1; }
-
-test-lint-executable:
-	@bad=`for i in $(T); do test -x "$$i" || echo $$i; done` && \
-		test -z "$$bad" || { \
-		echo >&2 "non-executable tests:" $$bad; exit 1; }
-
-test-lint-shell-syntax:
-	@'$(PERL_PATH_SQ)' ../../../t/check-non-portable-shell.pl $(T)
-
-aggregate-results-and-cleanup: $(T)
-	$(MAKE) aggregate-results
-	$(MAKE) clean
-
-aggregate-results:
-	for f in '$(TEST_RESULTS_DIRECTORY_SQ)'/t*-*.counts; do \
-		echo "$$f"; \
-	done | '$(SHELL_PATH_SQ)' ../../../t/aggregate-results.sh
-
-valgrind:
-	$(MAKE) GIT_TEST_OPTS="$(GIT_TEST_OPTS) --valgrind"
-
-test-results:
-	mkdir -p test-results
-
-.PHONY: $(T) aggregate-results clean valgrind
diff --git a/contrib/scalar/scalar.c b/scalar.c
similarity index 100%
rename from contrib/scalar/scalar.c
rename to scalar.c
diff --git a/contrib/scalar/t/t9099-scalar.sh b/t/t9099-scalar.sh
similarity index 94%
rename from contrib/scalar/t/t9099-scalar.sh
rename to t/t9099-scalar.sh
index 7e8771d0eff..f686ba5d84c 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/t/t9099-scalar.sh
@@ -2,13 +2,7 @@
 
 test_description='test the `scalar` command'
 
-TEST_DIRECTORY=$PWD/../../../t
-export TEST_DIRECTORY
-
-# Make it work with --no-bin-wrappers
-PATH=$PWD/..:$PATH
-
-. ../../../t/test-lib.sh
+. ./test-lib.sh
 
 GIT_TEST_MAINT_SCHEDULER="crontab:test-tool crontab ../cron.txt"
 export GIT_TEST_MAINT_SCHEDULER
Johannes Schindelin Oct. 7, 2021, 10:58 a.m. UTC | #16
Hi Junio,

On Wed, 6 Oct 2021, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I don't think that there is a lot of sense in that. The main benefits of
> > `scalar` are in the `register` and the `clone` part, and the most natural
> > end game would hence be for `git init` and `git clone` to sprout new
> > options to support Scalar's features, in a Git-native way.
>
> Yes, that is even better.  An endgame where everybody benefits
> natively would be highly desirable.

I am glad that this question is now resolved.

> Now you are back, do you think we can have the "no more preserve
> merges backend" topic graduate to 'master', or do you prefer to cook
> it over the cycle (or even two)?

I did not see _any_ problems with it, so I'd be in favor of promoting it
to `master`. But if you want to play the game more carefully, I would be
fine with that, too.

Ciao,
Dscho