mbox series

[0/5] hook API: support stdin, convert post-rewrite

Message ID cover-0.5-00000000000-20230123T170550Z-avarab@gmail.com (mailing list archive)
Headers show
Series hook API: support stdin, convert post-rewrite | expand

Message

Ævar Arnfjörð Bjarmason Jan. 23, 2023, 5:15 p.m. UTC
The greater "config-based-hooks" topic has been stalled for a while.

In the last couple of cycles it was held up by run-command.c API
changes (which will make the rest of this much simpler), and lack of
time on my part.

But let's get the ball rolling again. The last time this was
on-list[1] it was part of a 36 patch series, these 5 patches only
convert one hook (implemented by both sequencer & git-am) to the hook
API, but it's an important step: To do so we need so support reading
from stdin, and passing that to the hooks.

The (passing) CI & topic branch for this is at[2].

The immediate motivation for this is to supply a "stdin" interface
that git-send-email.perl can use, see [3].

Changes since the [1] v5:

 * A new 1/5 here, picked from
   https://lore.kernel.org/git/patch-v2-07.22-b90961ae76d-20221012T084850Z-avarab@gmail.com/;
   A small while-at-it cleanup of a related API.

 * Updates for the aforementioned run-command API changes.

 * Commit message updates & clarifications.

 * The previous version of the "sequencer.c" change changed a variable
   name while at it, now it doesn't, making the diff much easier to
   read.

 * Updates for the *.txt and -h usage, so that we'll pass the
   since-added t0450 test.

1. https://lore.kernel.org/git/cover-v5-00.36-00000000000-20210902T125110Z-avarab@gmail.com/
2. https://github.com/avar/git/tree/es-avar/config-based-hooks-the-beginning
3. https://lore.kernel.org/git/230123.86wn5ds602.gmgdl@evledraar.gmail.com/

Emily Shaffer (4):
  run-command: allow stdin for run_processes_parallel
  hook API: support passing stdin to hooks, convert am's 'post-rewrite'
  sequencer: use the new hook API for the simpler "post-rewrite" call
  hook: support a --to-stdin=<path> option for testing

Ævar Arnfjörð Bjarmason (1):
  run-command.c: remove dead assignment in while-loop

 Documentation/git-hook.txt |  7 ++++++-
 builtin/am.c               | 20 ++++----------------
 builtin/hook.c             |  4 +++-
 hook.c                     |  8 +++++++-
 hook.h                     |  5 +++++
 run-command.c              | 13 +++++++++----
 sequencer.c                | 18 ++++--------------
 t/t1800-hook.sh            | 18 ++++++++++++++++++
 8 files changed, 56 insertions(+), 37 deletions(-)

Range-diff:
 1:  ac419613fdc <  -:  ----------- Makefile: mark "check" target as .PHONY
 2:  a161b7f0a5c <  -:  ----------- Makefile: stop hardcoding {command,config}-list.h
 3:  ffef1d3257e <  -:  ----------- Makefile: remove an out-of-date comment
 4:  545e16c6f04 <  -:  ----------- hook.[ch]: move find_hook() from run-command.c to hook.c
 5:  a9bc4519e9a <  -:  ----------- hook.c: add a hook_exists() wrapper and use it in bugreport.c
 6:  e99ec2e6f8f <  -:  ----------- hook.c users: use "hook_exists()" instead of "find_hook()"
 7:  2ffb2332c8a <  -:  ----------- hook-list.h: add a generated list of hooks, like config-list.h
 8:  72dd1010f5b <  -:  ----------- hook: add 'run' subcommand
 9:  821cc9bf11e <  -:  ----------- gc: use hook library for pre-auto-gc hook
10:  d71c90254ea <  -:  ----------- rebase: convert pre-rebase to use hook.h
11:  ea3af2ccc4d <  -:  ----------- am: convert applypatch to use hook.h
12:  fed0b52f88f <  -:  ----------- hooks: convert 'post-checkout' hook to hook library
13:  53d8721a0e3 <  -:  ----------- merge: convert post-merge to use hook.h
14:  d60827a2856 <  -:  ----------- git hook run: add an --ignore-missing flag
15:  d4976a0821f <  -:  ----------- send-email: use 'git hook run' for 'sendemail-validate'
16:  99f3dcd1945 <  -:  ----------- git-p4: use 'git hook' to run hooks
17:  509761454e6 <  -:  ----------- commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
18:  e2c94d95427 <  -:  ----------- read-cache: convert post-index-change to use hook.h
19:  fa7d0d24ea2 <  -:  ----------- receive-pack: convert push-to-checkout hook to hook.h
20:  428bb5a6792 <  -:  ----------- run-command: remove old run_hook_{le,ve}() hook API
 -:  ----------- >  1:  351c6a55a41 run-command.c: remove dead assignment in while-loop
21:  994f6ad8602 !  2:  81eef2f60a0 run-command: allow stdin for run_processes_parallel
    @@ Commit message
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## run-command.c ##
    -@@ run-command.c: static int pp_start_one(struct parallel_processes *pp)
    - 	if (i == pp->max_processes)
    +@@ run-command.c: static int pp_start_one(struct parallel_processes *pp,
    + 	if (i == opts->processes)
      		BUG("bookkeeping is hard");
      
     +	/*
    @@ run-command.c: static int pp_start_one(struct parallel_processes *pp)
     +	 */
     +	pp->children[i].process.no_stdin = 1;
     +
    - 	code = pp->get_next_task(&pp->children[i].process,
    - 				 &pp->children[i].err,
    - 				 pp->data,
    -@@ run-command.c: static int pp_start_one(struct parallel_processes *pp)
    + 	code = opts->get_next_task(&pp->children[i].process,
    + 				   opts->ungroup ? NULL : &pp->children[i].err,
    + 				   opts->data,
    +@@ run-command.c: static int pp_start_one(struct parallel_processes *pp,
    + 		pp->children[i].process.err = -1;
    + 		pp->children[i].process.stdout_to_stderr = 1;
      	}
    - 	pp->children[i].process.err = -1;
    - 	pp->children[i].process.stdout_to_stderr = 1;
     -	pp->children[i].process.no_stdin = 1;
      
      	if (start_command(&pp->children[i].process)) {
    - 		code = pp->start_failure(&pp->children[i].err,
    + 		if (opts->start_failure)
23:  f548e3d15e7 !  3:  c6b9b69c516 am: convert 'post-rewrite' hook to hook.h
    @@ Metadata
     Author: Emily Shaffer <emilyshaffer@google.com>
     
      ## Commit message ##
    -    am: convert 'post-rewrite' hook to hook.h
    +    hook API: support passing stdin to hooks, convert am's 'post-rewrite'
    +
    +    Convert the invocation of the 'post-rewrite' hook run by 'git am' to
    +    use the hook.h library. To do this we need to add a "path_to_stdin"
    +    member to "struct run_hooks_opt".
    +
    +    In our API this is supported by asking for a file path, rather
    +    than by reading stdin. Reading directly from stdin would involve caching
    +    the entire stdin (to memory or to disk) once the hook API is made to
    +    support "jobs" larger than 1, along with support for executing N hooks
    +    at a time (i.e. the upcoming config-based hooks).
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    @@ builtin/am.c: static int run_applypatch_msg_hook(struct am_state *state)
     -
     -	if (!hook)
     -		return 0;
    --
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    + 
     -	strvec_push(&cp.args, hook);
     -	strvec_push(&cp.args, "rebase");
    --
    ++	strvec_push(&opt.args, "rebase");
    ++	opt.path_to_stdin = am_path(state, "rewritten");
    + 
     -	cp.in = xopen(am_path(state, "rewritten"), O_RDONLY);
     -	cp.stdout_to_stderr = 1;
     -	cp.trace2_hook_name = "post-rewrite";
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    - 
    +-
     -	ret = run_command(&cp);
    -+	strvec_push(&opt.args, "rebase");
    -+	opt.path_to_stdin = am_path(state, "rewritten");
    - 
    +-
     -	close(cp.in);
     -	return ret;
    -+	return run_hooks_oneshot("post-rewrite", &opt);
    ++	return run_hooks_opt("post-rewrite", &opt);
      }
      
      /**
    +
    + ## hook.c ##
    +@@ hook.c: static int pick_next_hook(struct child_process *cp,
    + 	if (!hook_path)
    + 		return 0;
    + 
    +-	cp->no_stdin = 1;
    + 	strvec_pushv(&cp->env, hook_cb->options->env.v);
    ++	/* reopen the file for stdin; run_command closes it. */
    ++	if (hook_cb->options->path_to_stdin) {
    ++		cp->no_stdin = 0;
    ++		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
    ++	} else {
    ++		cp->no_stdin = 1;
    ++	}
    + 	cp->stdout_to_stderr = 1;
    + 	cp->trace2_hook_name = hook_cb->hook_name;
    + 	cp->dir = hook_cb->options->dir;
    +
    + ## hook.h ##
    +@@ hook.h: struct run_hooks_opt
    + 	 * was invoked.
    + 	 */
    + 	int *invoked_hook;
    ++
    ++	/**
    ++	 * Path to file which should be piped to stdin for each hook.
    ++	 */
    ++	const char *path_to_stdin;
    + };
    + 
    + #define RUN_HOOKS_OPT_INIT { \
 -:  ----------- >  4:  7a55c95f60f sequencer: use the new hook API for the simpler "post-rewrite" call
22:  3ccc654a664 !  5:  cb9ef7a89c4 hook: support passing stdin to hooks
    @@ Metadata
     Author: Emily Shaffer <emilyshaffer@google.com>
     
      ## Commit message ##
    -    hook: support passing stdin to hooks
    +    hook: support a --to-stdin=<path> option for testing
     
    -    Some hooks (such as post-rewrite) need to take input via stdin.
    -    Previously, callers provided stdin to hooks by setting
    -    run-command.h:child_process.in, which takes a FD. Callers would open the
    -    file in question themselves before calling run-command(). However, since
    -    we will now need to seek to the front of the file and read it again for
    -    every hook which runs, hook.h:run_command() takes a path and handles FD
    -    management itself. Since this file is opened for read only, it should
    -    not prevent later parallel execution support.
    -
    -    On the frontend, this is supported by asking for a file path, rather
    -    than by reading stdin. Reading directly from stdin would involve caching
    -    the entire stdin (to memory or to disk) and reading it back from the
    -    beginning to each hook. We'd want to support cases like insufficient
    -    memory or storage for the file. While this may prove useful later, for
    -    now the path of least resistance is to just ask the user to make this
    -    interim file themselves.
    +    Expose the "path_to_stdin" API added in the preceding commit in the
    +    "git hook run" command. For now we won't be using this command
    +    interface outside of the tests, but exposing this functionality makes
    +    it easier to test the hook API.
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Documentation/git-hook.txt ##
    -@@ Documentation/git-hook.txt: git-hook - run git hooks
    +@@ Documentation/git-hook.txt: git-hook - Run git hooks
      SYNOPSIS
      --------
      [verse]
     -'git hook' run [--ignore-missing] <hook-name> [-- <hook-args>]
    -+'git hook' run [--to-stdin=<path>] [--ignore-missing] <hook-name> [-- <hook-args>]
    ++'git hook' run [--ignore-missing] [--to-stdin=<path>] <hook-name> [-- <hook-args>]
      
      DESCRIPTION
      -----------
    -@@ Documentation/git-hook.txt: what those are.
    +@@ Documentation/git-hook.txt: linkgit:githooks[5] for arguments hooks might expect (if any).
      OPTIONS
      -------
      
    @@ builtin/hook.c
     @@ builtin/hook.c: static int run(int argc, const char **argv, const char *prefix)
      	struct option run_options[] = {
      		OPT_BOOL(0, "ignore-missing", &ignore_missing,
    - 			 N_("exit quietly with a zero exit code if the requested hook cannot be found")),
    + 			 N_("silently ignore missing requested <hook-name>")),
     +		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
     +			   N_("file to read into hooks' stdin")),
      		OPT_END(),
      	};
      	int ret;
     
    - ## hook.c ##
    -@@ hook.c: static int pick_next_hook(struct child_process *cp,
    - 	if (!run_me)
    - 		return 0;
    - 
    --	cp->no_stdin = 1;
    -+	/* reopen the file for stdin; run_command closes it. */
    -+	if (hook_cb->options->path_to_stdin) {
    -+		cp->no_stdin = 0;
    -+		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
    -+	} else {
    -+		cp->no_stdin = 1;
    -+	}
    - 	cp->env = hook_cb->options->env.v;
    - 	cp->stdout_to_stderr = 1;
    - 	cp->trace2_hook_name = hook_cb->hook_name;
    -
    - ## hook.h ##
    -@@ hook.h: struct run_hooks_opt
    - 
    - 	/* Path to initial working directory for subprocess */
    - 	const char *dir;
    -+
    -+	/* Path to file which should be piped to stdin for each hook */
    -+	const char *path_to_stdin;
    - };
    - 
    - #define RUN_HOOKS_OPT_INIT { \
    -
      ## t/t1800-hook.sh ##
    -@@ t/t1800-hook.sh: test_expect_success 'git -c core.hooksPath=<PATH> hook run' '
    +@@ t/t1800-hook.sh: test_expect_success 'git hook run a hook with a bad shebang' '
      	test_cmp expect actual
      '
      
24:  bb119fa7cc0 <  -:  ----------- run-command: add stdin callback for parallelization
25:  2439f7752b8 <  -:  ----------- hook: provide stdin by string_list or callback
26:  48a380b3a91 <  -:  ----------- hook: convert 'post-rewrite' hook in sequencer.c to hook.h
27:  af6b9292aaa <  -:  ----------- transport: convert pre-push hook to hook.h
28:  957691f0b6d <  -:  ----------- hook tests: test for exact "pre-push" hook input
29:  88fe2621549 <  -:  ----------- hook tests: use a modern style for "pre-push" tests
30:  1d905e81779 <  -:  ----------- reference-transaction: use hook.h to run hooks
31:  fac56a9d8af <  -:  ----------- run-command: allow capturing of collated output
32:  7d185cdf9d1 <  -:  ----------- hooks: allow callers to capture output
33:  c8150e1239f <  -:  ----------- receive-pack: convert 'update' hook to hook.h
34:  a20ad847c14 <  -:  ----------- post-update: use hook.h library
35:  79c380be6ed <  -:  ----------- receive-pack: convert receive hooks to hook.h
36:  fe056098534 <  -:  ----------- hooks: fix a TOCTOU in "did we run a hook?" heuristic

Comments

Ævar Arnfjörð Bjarmason Feb. 8, 2023, 7:21 p.m. UTC | #1
BEGIN UPDATE

I sent this v2 in already as [0]; but didn't fill in the
In-Reply-Header, consequently it was disconnected from the v1 thread,
and per https://lore.kernel.org/git/xmqqr0v7o0pp.fsf@gitster.g/ (and
in "seen") hasn't been picked up. Sorry about that.

END UPDATE

As noted in the v1[1] this is the initial part of the greater
"config-based hooks" topic. I believe this iteration addresses all
comments on v1. Changes since then:

* Remove a couple of paragraphs in 1/4 that aren't relevant anymore,
  an already-landed topic addressed those.

* Don't needlessly change "cp->no_stdin = 1" and introduce an
  "else". This refactoring was there because that code eventually
  changes in the full "config-based hooks" topic, but going through
  those future changes I found that it wasn't for a good reason there
  either. We can just keep the "no_stdin = 1" by default, and have
  specific cases override that.

* Elaborate on why we're not converting the last "post-rewrite" hook
  here.

* Mention the future expected use for sendemail-validate in 5/5

The (passing) CI & topic branch for this is at[2].

0. https://lore.kernel.org/git/cover-v2-0.5-00000000000-20230203T104319Z-avarab@gmail.com/
1. https://lore.kernel.org/git/cover-0.5-00000000000-20230123T170550Z-avarab@gmail.com/
2. https://github.com/avar/git/tree/es-avar/config-based-hooks-the-beginning-2

Emily Shaffer (4):
  run-command: allow stdin for run_processes_parallel
  hook API: support passing stdin to hooks, convert am's 'post-rewrite'
  sequencer: use the new hook API for the simpler "post-rewrite" call
  hook: support a --to-stdin=<path> option

Ævar Arnfjörð Bjarmason (1):
  run-command.c: remove dead assignment in while-loop

 Documentation/git-hook.txt |  7 ++++++-
 builtin/am.c               | 20 ++++----------------
 builtin/hook.c             |  4 +++-
 hook.c                     |  5 +++++
 hook.h                     |  5 +++++
 run-command.c              | 13 +++++++++----
 sequencer.c                | 18 ++++--------------
 t/t1800-hook.sh            | 18 ++++++++++++++++++
 8 files changed, 54 insertions(+), 36 deletions(-)

Range-diff against v1:
1:  351c6a55a41 ! 1:  488b24e1c98 run-command.c: remove dead assignment in while-loop
    @@ Commit message
     
         Remove code that's been unused since it was added in
         c553c72eed6 (run-command: add an asynchronous parallel child
    -    processor, 2015-12-15), the next use of "i" in this function is:
    -
    -            for (i = 0; ...
    -
    -    So we'll always clobber the "i" that's set here. Presumably the "i"
    -    assignment is an artifact of WIP code that made it into our tree.
    -
    -    A subsequent commit will need to adjust the type of the "i" variable
    -    in the otherwise unrelated for-loop, which is why this is being
    -    removed now.
    +    processor, 2015-12-15).
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
2:  81eef2f60a0 = 2:  9a178577dcc run-command: allow stdin for run_processes_parallel
3:  c6b9b69c516 ! 3:  3d3dd6b900a hook API: support passing stdin to hooks, convert am's 'post-rewrite'
    @@ builtin/am.c: static int run_applypatch_msg_hook(struct am_state *state)
     
      ## hook.c ##
     @@ hook.c: static int pick_next_hook(struct child_process *cp,
    - 	if (!hook_path)
    - 		return 0;
      
    --	cp->no_stdin = 1;
    + 	cp->no_stdin = 1;
      	strvec_pushv(&cp->env, hook_cb->options->env.v);
     +	/* reopen the file for stdin; run_command closes it. */
     +	if (hook_cb->options->path_to_stdin) {
     +		cp->no_stdin = 0;
     +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
    -+	} else {
    -+		cp->no_stdin = 1;
     +	}
      	cp->stdout_to_stderr = 1;
      	cp->trace2_hook_name = hook_cb->hook_name;
4:  7a55c95f60f ! 4:  b96522d593f sequencer: use the new hook API for the simpler "post-rewrite" call
    @@ Commit message
     
         This leaves the more complex "post-rewrite" invocation added in
         a87a6f3c98e (commit: move post-rewrite code to libgit, 2017-11-17)
    -    here in sequencer.c unconverted. That'll be done in a subsequent
    -    commit.
    +    here in sequencer.c unconverted.
    +
    +    Here we can pass in a file's via the "in" file descriptor, in that
    +    case we don't have a file, but will need to write_in_full() to an "in"
    +    provide by the API. Support for that will be added to the hook API in
    +    the future, but we're not there yet.
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
5:  cb9ef7a89c4 ! 5:  b4e02f41194 hook: support a --to-stdin=<path> option for testing
    @@ Metadata
     Author: Emily Shaffer <emilyshaffer@google.com>
     
      ## Commit message ##
    -    hook: support a --to-stdin=<path> option for testing
    +    hook: support a --to-stdin=<path> option
     
         Expose the "path_to_stdin" API added in the preceding commit in the
    -    "git hook run" command. For now we won't be using this command
    -    interface outside of the tests, but exposing this functionality makes
    -    it easier to test the hook API.
    +    "git hook run" command.
    +
    +    For now we won't be using this command interface outside of the tests,
    +    but exposing this functionality makes it easier to test the hook
    +    API. The plan is to use this to extend the "sendemail-validate"
    +    hook[1][2].
    +
    +    1. https://lore.kernel.org/git/ad152e25-4061-9955-d3e6-a2c8b1bd24e7@amd.com
    +    2. https://lore.kernel.org/git/20230120012459.920932-1-michael.strawbridge@amd.com
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Junio C Hamano Feb. 8, 2023, 9:23 p.m. UTC | #2
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> As noted in the v1[1] this is the initial part of the greater
> "config-based hooks" topic. I believe this iteration addresses all
> comments on v1. Changes since then:
>
> * Remove a couple of paragraphs in 1/4 that aren't relevant anymore,
>   an already-landed topic addressed those.
>
> * Don't needlessly change "cp->no_stdin = 1" and introduce an
>   "else". This refactoring was there because that code eventually
>   changes in the full "config-based hooks" topic, but going through
>   those future changes I found that it wasn't for a good reason there
>   either. We can just keep the "no_stdin = 1" by default, and have
>   specific cases override that.
>
> * Elaborate on why we're not converting the last "post-rewrite" hook
>   here.
>
> * Mention the future expected use for sendemail-validate in 5/5

All read well.  Will queue.  Thanks.