Message ID | pull.745.git.1602074589460.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | help: do not expect built-in commands to be hardlinked | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > When building with SKIP_DASHED_BUILT_INS=YesPlease, the built-in > commands are no longer present in the `PATH` as hardlinks to `git`. > > As a consequence, `load_command_list()` needs to be taught to find the > names of the built-in commands from elsewhere. > > This only affected the output of `git --list-cmds=main`, but not the > output of `git help -a` because the latter includes the built-in > commands by virtue of them being listed in command-list.txt. > > The bug was detected via a patch series that turns the merge strategies > included in Git into built-in commands: `git merge -s help` relies on > `load_command_list()` to determine the list of available merge > strategies. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > Fix the command list with SKIP_DASHED_BUILT_INS=YesPlease > > In a recent patch series > [https://lore.kernel.org/git/20201005122646.27994-12-alban.gruin@gmail.com/#r] > , the merge strategies were converted into built-ins, which is good. > > Together with the change where we stop hard-linking the built-in > commands in CI builds, this broke t9902.199. > > The actual root cause is that git merge -s help relies on > load_command_list() to find all available Git commands, and that > function had the long-standing bug that it expects the built-in commands > to be available in the PATH. > That is not a bug in "merge -s help" or "longstanding" at all. It has been a quite natural and long-standing expectation to find all the merge strategies on PATH (after GIT_EXEC_PATH is added to it), because that was the promise we gave to our users long time ago and have kept. The bug is in load_command_list() and it was introduced by the recent SKIP_DASHED_BUILT_INS series. We forgot to teach the function that in the new world order, what we see on disk plus what we have in the built-in table are the set of subcommands available to us, and the rule that was valid in the old world order can no longer be relied upon, and nobody noticed the breakage while developing or reviewing. > git.c | 13 +++++++++++++ > help.c | 2 ++ > help.h | 1 + > 3 files changed, 16 insertions(+) > > diff --git a/git.c b/git.c > index d51fb5d2bf..a6224badce 100644 > --- a/git.c > +++ b/git.c > @@ -641,6 +641,19 @@ static void list_builtins(struct string_list *out, unsigned int exclude_option) > } > } > > +void load_builtin_commands(const char *prefix, struct cmdnames *cmds) > +{ > + const char *name; > + int i; > + > + if (!skip_prefix(prefix, "git-", &prefix)) > + return; Do we want to explain that this is for dropping "gitk" and the like in a comment near here? > + for (i = 0; i < ARRAY_SIZE(commands); i++) > + if (skip_prefix(commands[i].cmd, prefix, &name)) > + add_cmdname(cmds, name, strlen(name)); > +} > + > #ifdef STRIP_EXTENSION > static void strip_extension(const char **argv) > { > diff --git a/help.c b/help.c > index 4e2468a44d..919cbb9206 100644 > --- a/help.c > +++ b/help.c > @@ -263,6 +263,8 @@ void load_command_list(const char *prefix, > const char *env_path = getenv("PATH"); > const char *exec_path = git_exec_path(); > > + load_builtin_commands(prefix, main_cmds); > + > if (exec_path) { > list_commands_in_dir(main_cmds, exec_path, prefix); > QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); > diff --git a/help.h b/help.h > index dc02458855..5871e93ba2 100644 > --- a/help.h > +++ b/help.h > @@ -32,6 +32,7 @@ const char *help_unknown_cmd(const char *cmd); > void load_command_list(const char *prefix, > struct cmdnames *main_cmds, > struct cmdnames *other_cmds); > +void load_builtin_commands(const char *prefix, struct cmdnames *cmds); > void add_cmdname(struct cmdnames *cmds, const char *name, int len); > /* Here we require that excludes is a sorted list. */ > void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes); > > base-commit: 8f7759d2c8c13716bfdb9ae602414fd987787e8d
Junio C Hamano <gitster@pobox.com> writes: > ... in the new world order, what we see on disk plus what > we have in the built-in table are the set of subcommands available > to us, and the rule that was valid in the old world order can no > longer be relied upon, and nobody noticed the breakage while > developing or reviewing. >> diff --git a/help.c b/help.c >> index 4e2468a44d..919cbb9206 100644 >> --- a/help.c >> +++ b/help.c >> @@ -263,6 +263,8 @@ void load_command_list(const char *prefix, >> const char *env_path = getenv("PATH"); >> const char *exec_path = git_exec_path(); >> >> + load_builtin_commands(prefix, main_cmds); >> + >> if (exec_path) { >> list_commands_in_dir(main_cmds, exec_path, prefix); >> QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); I wondered if we need, after this change, to worry about duplicates, because some Git subcommands, even after they made into a built-in and callable internally, must have on-disk footprint. It turns out that after the post-context in this hunk we do make a call to uniq(main_cmds) so it is fine. This was unexpected to me, as we read only from a single directory "exec_path" and the need to call uniq() in the old world order would have meant that readdir in exec_path gave us duplicate entries. In fact, the very original version of load_command_list() did not have this unnecessary call to uniq(). It was introduced in 1f08e5ce (Allow git help work without PATH set, 2008-08-28); perhaps Alex saw 12 years into the future and predicted that we would start needing it ;-) In any case, the patch is good thanks to that existing uniq() call. >> diff --git a/help.h b/help.h >> index dc02458855..5871e93ba2 100644 >> --- a/help.h >> +++ b/help.h >> @@ -32,6 +32,7 @@ const char *help_unknown_cmd(const char *cmd); >> void load_command_list(const char *prefix, >> struct cmdnames *main_cmds, >> struct cmdnames *other_cmds); >> +void load_builtin_commands(const char *prefix, struct cmdnames *cmds); >> void add_cmdname(struct cmdnames *cmds, const char *name, int len); >> /* Here we require that excludes is a sorted list. */ >> void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes); >> >> base-commit: 8f7759d2c8c13716bfdb9ae602414fd987787e8d
Hi Junio, On Wed, 7 Oct 2020, Junio C Hamano wrote: > "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> > writes: > > > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > When building with SKIP_DASHED_BUILT_INS=YesPlease, the built-in > > commands are no longer present in the `PATH` as hardlinks to `git`. > > > > As a consequence, `load_command_list()` needs to be taught to find the > > names of the built-in commands from elsewhere. > > > > This only affected the output of `git --list-cmds=main`, but not the > > output of `git help -a` because the latter includes the built-in > > commands by virtue of them being listed in command-list.txt. > > > > The bug was detected via a patch series that turns the merge strategies > > included in Git into built-in commands: `git merge -s help` relies on > > `load_command_list()` to determine the list of available merge > > strategies. > > > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > --- > > Fix the command list with SKIP_DASHED_BUILT_INS=YesPlease > > > > In a recent patch series > > [https://lore.kernel.org/git/20201005122646.27994-12-alban.gruin@gmail.com/#r] > > , the merge strategies were converted into built-ins, which is good. > > > > Together with the change where we stop hard-linking the built-in > > commands in CI builds, this broke t9902.199. > > > > The actual root cause is that git merge -s help relies on > > load_command_list() to find all available Git commands, and that > > function had the long-standing bug that it expects the built-in commands > > to be available in the PATH. > > > > That is not a bug in "merge -s help" or "longstanding" at all. It > has been a quite natural and long-standing expectation to find all > the merge strategies on PATH (after GIT_EXEC_PATH is added to it), > because that was the promise we gave to our users long time ago and > have kept. Sure, we promised to the outside world that those built-ins would always be in the PATH, but we highly recommended against using dashed invocations. To me, that means that _internally_ we should have been more stringent about how we do things ourselves. In any case, your complaint isn't about the commit message, so I hope it can advance? > The bug is in load_command_list() and it was introduced by the > recent SKIP_DASHED_BUILT_INS series. We forgot to teach the > function that in the new world order, what we see on disk plus what > we have in the built-in table are the set of subcommands available > to us, and the rule that was valid in the old world order can no > longer be relied upon, and nobody noticed the breakage while > developing or reviewing. > > > git.c | 13 +++++++++++++ > > help.c | 2 ++ > > help.h | 1 + > > 3 files changed, 16 insertions(+) > > > > diff --git a/git.c b/git.c > > index d51fb5d2bf..a6224badce 100644 > > --- a/git.c > > +++ b/git.c > > @@ -641,6 +641,19 @@ static void list_builtins(struct string_list *out, unsigned int exclude_option) > > } > > } > > > > +void load_builtin_commands(const char *prefix, struct cmdnames *cmds) > > +{ > > + const char *name; > > + int i; > > + > > + if (!skip_prefix(prefix, "git-", &prefix)) > > + return; > > Do we want to explain that this is for dropping "gitk" and the like > in a comment near here? I guess I have to explain this, as it is too easy to mistake this `skip_prefix()` to work on the actual command names rather than about the `prefix` parameter. The `commands[]` array in `git.c` stores only the command names, but `load_command_list()` is called with the prefix `git-` or `git-merge-`. Therefore, `load_builtin_commands()` skips the prefix `git-` *from the `prefix` itself*. I'll send the next iteration shortly. Ciao, Dscho > > > + for (i = 0; i < ARRAY_SIZE(commands); i++) > > + if (skip_prefix(commands[i].cmd, prefix, &name)) > > + add_cmdname(cmds, name, strlen(name)); > > +} > > + > > #ifdef STRIP_EXTENSION > > static void strip_extension(const char **argv) > > { > > diff --git a/help.c b/help.c > > index 4e2468a44d..919cbb9206 100644 > > --- a/help.c > > +++ b/help.c > > @@ -263,6 +263,8 @@ void load_command_list(const char *prefix, > > const char *env_path = getenv("PATH"); > > const char *exec_path = git_exec_path(); > > > > + load_builtin_commands(prefix, main_cmds); > > + > > if (exec_path) { > > list_commands_in_dir(main_cmds, exec_path, prefix); > > QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); > > diff --git a/help.h b/help.h > > index dc02458855..5871e93ba2 100644 > > --- a/help.h > > +++ b/help.h > > @@ -32,6 +32,7 @@ const char *help_unknown_cmd(const char *cmd); > > void load_command_list(const char *prefix, > > struct cmdnames *main_cmds, > > struct cmdnames *other_cmds); > > +void load_builtin_commands(const char *prefix, struct cmdnames *cmds); > > void add_cmdname(struct cmdnames *cmds, const char *name, int len); > > /* Here we require that excludes is a sorted list. */ > > void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes); > > > > base-commit: 8f7759d2c8c13716bfdb9ae602414fd987787e8d >
Hi Junio, On Wed, 7 Oct 2020, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > ... in the new world order, what we see on disk plus what > > we have in the built-in table are the set of subcommands available > > to us, and the rule that was valid in the old world order can no > > longer be relied upon, and nobody noticed the breakage while > > developing or reviewing. > > >> diff --git a/help.c b/help.c > >> index 4e2468a44d..919cbb9206 100644 > >> --- a/help.c > >> +++ b/help.c > >> @@ -263,6 +263,8 @@ void load_command_list(const char *prefix, > >> const char *env_path = getenv("PATH"); > >> const char *exec_path = git_exec_path(); > >> > >> + load_builtin_commands(prefix, main_cmds); > >> + > >> if (exec_path) { > >> list_commands_in_dir(main_cmds, exec_path, prefix); > >> QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); > > I wondered if we need, after this change, to worry about duplicates, > because some Git subcommands, even after they made into a built-in > and callable internally, must have on-disk footprint. > > It turns out that after the post-context in this hunk we do make a > call to uniq(main_cmds) so it is fine. > > This was unexpected to me, as we read only from a single directory > "exec_path" and the need to call uniq() in the old world order would > have meant that readdir in exec_path gave us duplicate entries. > > In fact, the very original version of load_command_list() did not > have this unnecessary call to uniq(). It was introduced in 1f08e5ce > (Allow git help work without PATH set, 2008-08-28); perhaps Alex saw > 12 years into the future and predicted that we would start needing > it ;-) > > In any case, the patch is good thanks to that existing uniq() call. Yep, I was fully prepared to add that `uniq()` call and was surprised to find it. I guess it was "for good measure" because the same commit also added the same `qsort(); uniq()` combo another time, a little further down in that function. Now, what I would have expected you to say when you found the `uniq()` function is: Johannes, why don't you call `QSORT(); uniq()` after the call to `load_builtin_commands()`? After all, `exec_path` and `env_path` might both be `NULL`... Well, the answer to that question is _not_ "but without `env_path` nothing works anyway" even if that would be pretty valid. The answer is that the `commands[]` list in `git.c` is already sorted alphabetically. Thanks, Dscho > > >> diff --git a/help.h b/help.h > >> index dc02458855..5871e93ba2 100644 > >> --- a/help.h > >> +++ b/help.h > >> @@ -32,6 +32,7 @@ const char *help_unknown_cmd(const char *cmd); > >> void load_command_list(const char *prefix, > >> struct cmdnames *main_cmds, > >> struct cmdnames *other_cmds); > >> +void load_builtin_commands(const char *prefix, struct cmdnames *cmds); > >> void add_cmdname(struct cmdnames *cmds, const char *name, int len); > >> /* Here we require that excludes is a sorted list. */ > >> void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes); > >> > >> base-commit: 8f7759d2c8c13716bfdb9ae602414fd987787e8d >
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Now, what I would have expected you to say when you found the `uniq()` > function is: Johannes, why don't you call `QSORT(); uniq()` after the call > to `load_builtin_commands()`? After all, `exec_path` and `env_path` might > both be `NULL`... Nah, you are expecting too much out of me. I didn't ask because I knew we didn't need to, and I didn't particularly care if you lucked out or if you had the same understanding as I had how you arrived at the right solution ;-)
diff --git a/git.c b/git.c index d51fb5d2bf..a6224badce 100644 --- a/git.c +++ b/git.c @@ -641,6 +641,19 @@ static void list_builtins(struct string_list *out, unsigned int exclude_option) } } +void load_builtin_commands(const char *prefix, struct cmdnames *cmds) +{ + const char *name; + int i; + + if (!skip_prefix(prefix, "git-", &prefix)) + return; + + for (i = 0; i < ARRAY_SIZE(commands); i++) + if (skip_prefix(commands[i].cmd, prefix, &name)) + add_cmdname(cmds, name, strlen(name)); +} + #ifdef STRIP_EXTENSION static void strip_extension(const char **argv) { diff --git a/help.c b/help.c index 4e2468a44d..919cbb9206 100644 --- a/help.c +++ b/help.c @@ -263,6 +263,8 @@ void load_command_list(const char *prefix, const char *env_path = getenv("PATH"); const char *exec_path = git_exec_path(); + load_builtin_commands(prefix, main_cmds); + if (exec_path) { list_commands_in_dir(main_cmds, exec_path, prefix); QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); diff --git a/help.h b/help.h index dc02458855..5871e93ba2 100644 --- a/help.h +++ b/help.h @@ -32,6 +32,7 @@ const char *help_unknown_cmd(const char *cmd); void load_command_list(const char *prefix, struct cmdnames *main_cmds, struct cmdnames *other_cmds); +void load_builtin_commands(const char *prefix, struct cmdnames *cmds); void add_cmdname(struct cmdnames *cmds, const char *name, int len); /* Here we require that excludes is a sorted list. */ void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes);