[v3,0/6] config-based hooks restarted

Message ID	20210819033450.3382652-1-emilyshaffer@google.com (mailing list archive)
Headers	show Return-Path: <git-owner@kernel.org> Date: Wed, 18 Aug 2021 20:34:44 -0700 In-Reply-To: <20210812004258.74318-1-emilyshaffer@google.com> Message-Id: <20210819033450.3382652-1-emilyshaffer@google.com> Mime-Version: 1.0 Subject: [PATCH v3 0/6] config-based hooks restarted From: Emily Shaffer <emilyshaffer@google.com> To: git@vger.kernel.org Cc: Emily Shaffer <emilyshaffer@google.com>, Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>, Taylor Blau <me@ttaylorr.com>, Felipe Contreras <felipe.contreras@gmail.com>, Eric Sunshine <sunshine@sunshineco.com>, "brian m. carlson" <sandals@crustytoothpaste.net>, Josh Steadmon <steadmon@google.com>, Jonathan Tan <jonathantanmy@google.com>, Derrick Stolee <stolee@gmail.com>, " =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFy?= =?utf-8?b?bWFzb24=?= " <avarab@gmail.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk
Series	config-based hooks restarted \| expand [v3,0/6] config-based hooks restarted [v3,1/6] hook: run a list of hooks instead [v3,2/6] hook: allow parallel hook execution [v3,3/6] hook: introduce "git hook list" [v3,4/6] hook: allow running non-native hooks [v3,5/6] hook: include hooks from the config [v3,6/6] hook: allow out-of-repo 'git hook' invocations

Emily Shaffer Aug. 19, 2021, 3:34 a.m. UTC

This is the config-based hooks topic rebased onto v4 of Ævar's
branch[1]. There is a happy CI build of it on GitHub[2].

The topic overall adds the ability to set up hooks by modifying the
config, in addition to placing specially named hooks into the hookdir.
This enables users to specify multiple hooks for a given event, and so
this topic also fleshes out the use of the run_processes_parallel() API
which is now introduced in Ævar's reordering of prior patches.

Patches 1-4 make some minor changes to prepare Ævar's series to handle
more than one hook at a time. With the exception of patch 4, there
should be no behavior change for existing hooks.

Patch 2 is opinionated about which hooks should and shouldn't be allowed
to run in parallel; if you care about a specific hook, please take a
look there.

Patch 5 is the motivating feature - it begins to parse the config
looking for hooks.

Patch 6 takes advantage of the decoupling of hooks and GITDIR to allow
out-of-repo hook runs, which would only run hooks specified in the
system or global config. This mainly targets 'sendemail-validate', but
'git-sendemail.perl' still explicitly disallows out-of-repo hook
execution on that hook for now. (Maybe that change should be added in
this series? Or maybe patch 6 belongs with that kind of change?)

Since v2, I have addressed comments left by Junio - thanks for the
review. Mostly these were small nits, but two somewhat larger changes
came out:

 - 'hook_list' becomes 'list_hooks' and 'list_hooks_gently'. It might
   still be better to name it something else, and it might still be
   better to decouple the "is this a known hook" check from the "give me
   all the hooks" function; comments welcome.
 - The documentation in patch 5, which starts using the config, has been
   expanded to hopefully become more clear. I am especially interested
   in feedback in this doc change, as it's the main place users should
   be able to learn how to use the new feature.

Everything else should be pretty minor.

Right now I'm trying to focus on this series first and foremost, hence
sending two rerolls based on the same version of Ævar's base restart.
I'll try to perform a code review on Ævar's latest tomorrow.

Thanks!
 - Emily

1: https://lore.kernel.org/git/cover-v4-00.36-00000000000-20210803T191505Z-avarab%40gmail.com
2: https://github.com/nasamuffin/git/actions/runs/1145128560

Emily Shaffer (6):
  hook: run a list of hooks instead
  hook: allow parallel hook execution
  hook: introduce "git hook list"
  hook: allow running non-native hooks
  hook: include hooks from the config
  hook: allow out-of-repo 'git hook' invocations

 Documentation/config/hook.txt |  22 +++
 Documentation/git-hook.txt    |  87 +++++++++-
 builtin/am.c                  |   4 +-
 builtin/checkout.c            |   2 +-
 builtin/clone.c               |   2 +-
 builtin/hook.c                |  65 +++++++-
 builtin/merge.c               |   2 +-
 builtin/rebase.c              |   2 +-
 builtin/receive-pack.c        |   9 +-
 builtin/worktree.c            |   2 +-
 commit.c                      |   2 +-
 git.c                         |   2 +-
 hook.c                        | 293 +++++++++++++++++++++++++++++-----
 hook.h                        |  61 +++++--
 read-cache.c                  |   2 +-
 refs.c                        |   2 +-
 reset.c                       |   3 +-
 sequencer.c                   |   4 +-
 t/t1800-hook.sh               | 161 ++++++++++++++++++-
 transport.c                   |   2 +-
 20 files changed, 648 insertions(+), 81 deletions(-)
 create mode 100644 Documentation/config/hook.txt

Range-diff against v2:
 1:  81fe1ed90d <  -:  ---------- Makefile: mark "check" target as .PHONY
 2:  b32abc81fb <  -:  ---------- Makefile: stop hardcoding {command,config}-list.h
 3:  fa46bd1154 <  -:  ---------- Makefile: remove an out-of-date comment
 4:  715fd63089 <  -:  ---------- hook.[ch]: move find_hook() to this new library
 5:  ac6c018d27 <  -:  ---------- hook.c: add a hook_exists() wrapper and use it in bugreport.c
 6:  53619d87a2 <  -:  ---------- hook.c users: use "hook_exists()" insted of "find_hook()"
 7:  39dbb89620 <  -:  ---------- hook-list.h: add a generated list of hooks, like config-list.h
 8:  7d2c1cec74 <  -:  ---------- hook: add 'run' subcommand
 9:  3fea1cdaf8 <  -:  ---------- gc: use hook library for pre-auto-gc hook
10:  c86a9c6f95 <  -:  ---------- rebase: convert pre-rebase to use hook.h
11:  31dff1f274 <  -:  ---------- am: convert applypatch to use hook.h
12:  a2e27d745c <  -:  ---------- hooks: convert 'post-checkout' hook to hook library
13:  1a71fa8b75 <  -:  ---------- merge: convert post-merge to use hook.h
14:  95e19fa468 <  -:  ---------- git hook run: add an --ignore-missing flag
15:  371f581e3b <  -:  ---------- send-email: use 'git hook run' for 'sendemail-validate'
16:  99306a4bd3 <  -:  ---------- git-p4: use 'git hook' to run hooks
17:  59d1a563ca <  -:  ---------- commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
18:  7c8de04707 <  -:  ---------- read-cache: convert post-index-change to use hook.h
19:  a2e7176609 <  -:  ---------- receive-pack: convert push-to-checkout hook to hook.h
20:  f03ef24df7 <  -:  ---------- run-command: remove old run_hook_{le,ve}() hook API
21:  18c1dfe677 <  -:  ---------- run-command: allow stdin for run_processes_parallel
22:  eff713fc19 <  -:  ---------- hook: support passing stdin to hooks
23:  2cbec65834 <  -:  ---------- am: convert 'post-rewrite' hook to hook.h
24:  5d34d18bee <  -:  ---------- run-command: add stdin callback for parallelization
25:  b989d910ed <  -:  ---------- hook: provide stdin by string_list or callback
26:  ce6929493b <  -:  ---------- hook: convert 'post-rewrite' hook in sequencer.c to hook.h
27:  13affda962 <  -:  ---------- transport: convert pre-push hook to hook.h
28:  c74720029a <  -:  ---------- hook tests: test for exact "pre-push" hook input
29:  1252a251bb <  -:  ---------- hook tests: use a modern style for "pre-push" tests
30:  3f2488fea6 <  -:  ---------- reference-transaction: use hook.h to run hooks
31:  2e78c47ead <  -:  ---------- run-command: allow capturing of collated output
32:  95132acbb0 <  -:  ---------- hooks: allow callers to capture output
33:  f2510ca910 <  -:  ---------- receive-pack: convert 'update' hook to hook.h
34:  8686308864 <  -:  ---------- post-update: use hook.h library
35:  b10f5142a0 <  -:  ---------- receive-pack: convert receive hooks to hook.h
36:  5c29c932c3 <  -:  ---------- hooks: fix a TOCTOU in "did we run a hook?" heuristic
37:  5177e8ba2c !  1:  6d6400329c hook: run a list of hooks instead
    @@ builtin/hook.c: static int run(int argc, const char **argv, const char *prefix)
     -		return run_hooks_oneshot(hook_name, &opt);
     -	hook_path = find_hook(hook_name);
     -	if (!hook_path) {
    -+	hooks = hook_list(hook_name);
    ++	hooks = list_hooks(hook_name);
     +	if (list_empty(hooks)) {
     +		/* ... act like run_hooks_oneshot() under --ignore-missing */
     +		if (ignore_missing)
    @@ hook.c
      
     +static void free_hook(struct hook *ptr)
     +{
    -+	if (ptr) {
    ++	if (ptr)
     +		free(ptr->feed_pipe_cb_data);
    -+	}
     +	free(ptr);
     +}
     +
    @@ hook.c
      static int known_hook(const char *name)
      {
      	const char **p;
    -@@ hook.c: int hook_exists(const char *name)
    - 	return !!find_hook(name);
    - }
    +@@ hook.c: const char *find_hook(const char *name)
      
    -+struct list_head* hook_list(const char* hookname)
    + int hook_exists(const char *name)
    + {
    +-	return !!find_hook(name);
    ++	return !list_empty(list_hooks(name));
    ++}
    ++
    ++struct list_head *list_hooks(const char *hookname)
     +{
     +	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
     +
     +	INIT_LIST_HEAD(hook_head);
     +
     +	if (!hookname)
    -+		return NULL;
    ++		BUG("null hookname was provided to hook_list()!");
     +
     +	if (have_git_dir()) {
     +		const char *hook_path = find_hook(hookname);
    @@ hook.c: int hook_exists(const char *name)
     +	}
     +
     +	return hook_head;
    -+}
    -+
    + }
    + 
      void run_hooks_opt_clear(struct run_hooks_opt *o)
    - {
    - 	strvec_clear(&o->env);
     @@ hook.c: static int pick_next_hook(struct child_process *cp,
      	cp->dir = hook_cb->options->dir;
      
    @@ hook.c: int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *optio
     -	hook_path = find_hook(hook_name);
     -	if (!hook_path) {
     -		ret = 0;
    -+	hooks = hook_list(hook_name);
    ++	hooks = list_hooks(hook_name);
     +
     +	/*
     +	 * If you need to act on a missing hook, use run_found_hooks()
    @@ hook.h: struct hook {
     + * Provides a linked list of 'struct hook' detailing commands which should run
     + * in response to the 'hookname' event, in execution order.
     + */
    -+struct list_head* hook_list(const char *hookname);
    ++struct list_head *list_hooks(const char *hookname);
     +
      struct run_hooks_opt
      {
38:  eda439cd57 !  2:  dfb995ce4d hook: allow parallel hook execution
    @@ Commit message
         hook: allow parallel hook execution
     
         In many cases, there's no reason not to allow hooks to execute in
    -    parallel. run_processes_parallel() is well-suited - it's a task queue
    -    that runs its housekeeping in series, which means users don't
    -    need to worry about thread safety on their callback data. True
    -    multithreaded execution with the async_* functions isn't necessary here.
    -    Synchronous hook execution can be achieved by only allowing 1 job to run
    -    at a time.
    +    parallel, if more than one was provided. hook.c already calls
    +    run_processes_parallel(), so all we need to do is allow the job count we
    +    hand to run_processes_parallel() to be greater than 1.
     
    -    Teach run_hooks() to use that function for simple hooks which don't
    -    require stdin or capture of stderr.
    +    If users have specified no alternative, we can use the processor count
    +    from online_cpus() to run an efficient number of tasks at once. However,
    +    users can also override this number if they want by configuring
    +    'hook.jobs'.
    +
    +    To avoid looking up 'hook.jobs' in cases where we don't end up with any
    +    hooks to run anyways, teach the hook runner commands to notice if
    +    .jobs==0 and do a config or online_cpus() lookup if so, when we already
    +    know we have jobs to run.
    +
    +    Serial execution can still be performed by setting .jobs == 1.
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    @@ builtin/am.c: static void am_destroy(const struct am_state *state)
      {
      	int ret;
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
      
      	assert(state->msg);
      	strvec_push(&opt.args, am_path(state, "final-commit"));
    @@ builtin/am.c: static int run_applypatch_msg_hook(struct am_state *state)
      static int run_post_rewrite_hook(const struct am_state *state)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      	strvec_push(&opt.args, "rebase");
      	opt.path_to_stdin = am_path(state, "rewritten");
    @@ builtin/checkout.c: struct branch_info {
      			      int changed)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
      
      	/* "new_commit" can be NULL when checking out from the index before
      	   a commit exists. */
    @@ builtin/clone.c: static int checkout(int submodule_progress)
      	struct tree_desc t;
      	int err = 0;
     -	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_SERIAL;
      
      	if (option_no_checkout)
      		return 0;
    @@ builtin/hook.c: static const char * const builtin_hook_run_usage[] = {
      {
      	int i;
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
      	int ignore_missing = 0;
      	const char *hook_name;
      	struct list_head *hooks;
    @@ builtin/merge.c: static void finish(struct commit *head_commit,
      {
      	struct strbuf reflog_message = STRBUF_INIT;
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	const struct object_id *head = &head_commit->object.oid;
      
      	if (!msg)
    @@ builtin/rebase.c: int cmd_rebase(int argc, const char **argv, const char *prefix
      	int reschedule_failed_exec = -1;
      	int allow_preemptive_ff = 1;
     -	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	struct option builtin_rebase_options[] = {
      		OPT_STRING(0, "onto", &options.onto_name,
      			   N_("revision"),
    @@ builtin/receive-pack.c: static int run_receive_hook(struct command *commands,
      			    const struct string_list *push_options)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	struct receive_hook_feed_context ctx;
      	struct command *iter = commands;
      
    @@ builtin/receive-pack.c: static int run_receive_hook(struct command *commands,
      static int run_update_hook(struct command *cmd)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      	strvec_pushl(&opt.args,
      		     cmd->ref_name,
    @@ builtin/receive-pack.c: static const char *push_to_checkout(unsigned char *hash,
      				    const char *work_tree)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
     +
      	opt.invoked_hook = invoked_hook;
      
    @@ builtin/receive-pack.c: static const char *update(struct command *cmd, struct sh
      {
      	struct command *cmd;
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      	for (cmd = commands; cmd; cmd = cmd->next) {
      		if (cmd->error_string || cmd->did_not_exist)
    @@ builtin/worktree.c: static int add_worktree(const char *path, const char *refnam
      	 */
      	if (!ret && opts->checkout) {
     -		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
      
      		strvec_pushl(&opt.env, "GIT_DIR", "GIT_WORK_TREE", NULL);
      		strvec_pushl(&opt.args,
    @@ commit.c: int run_commit_hook(int editor_is_used, const char *index_file,
      		    const char *name, ...)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_SERIAL;
      	va_list args;
      	const char *arg;
      
    @@ hook.c: int run_hooks(const char *hook_name, struct list_head *hooks,
      	cb_data.run_me = list_first_entry(hooks, struct hook, list);
      
     -	run_processes_parallel_tr2(jobs,
    -+	/* INIT_ASYNC sets jobs to 0, so go look up how many to use. */
    ++	/* INIT_PARALLEL sets jobs to 0, so go look up how many to use. */
     +	if (!options->jobs)
     +		options->jobs = configured_hook_jobs();
     +
    @@ hook.c: int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *optio
     +	 * Turn on parallelism by default. Hooks which don't want it should
     +	 * specify their options accordingly.
     +	 */
    -+	struct run_hooks_opt hook_opt_scratch = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt hook_opt_scratch = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      	if (!options)
      		options = &hook_opt_scratch;
    @@ hook.h: struct hook_cb_data {
      	int *invoked_hook;
      };
      
    -+#define RUN_HOOKS_OPT_INIT_SYNC { \
    ++#define RUN_HOOKS_OPT_INIT_SERIAL { \
     +	.jobs = 1, \
     +	.env = STRVEC_INIT, \
     +	.args = STRVEC_INIT, \
     +}
     +
    -+#define RUN_HOOKS_OPT_INIT_ASYNC { \
    ++#define RUN_HOOKS_OPT_INIT_PARALLEL { \
     +	.jobs = 0, \
     +	.env = STRVEC_INIT, \
     +	.args = STRVEC_INIT, \
    @@ read-cache.c: static int do_write_locked_index(struct index_state *istate, struc
      	int ret;
      	int was_full = !istate->sparse_index;
     -	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      	ret = convert_to_sparse(istate);
      
    @@ refs.c: int ref_update_reject_duplicates(struct string_list *refnames,
      				const char *state)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	struct string_list to_stdin = STRING_LIST_INIT_NODUP;
      	int ret = 0, i;
      
    @@ reset.c: int reset_head(struct repository *r, struct object_id *oid, const char
      	}
      	if (run_hook) {
     -		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++		struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
     +
      		strvec_pushl(&opt.args,
      			     oid_to_hex(orig ? orig : null_oid()),
    @@ sequencer.c: int update_head_with_reflog(const struct commit *old_head,
      			    const struct object_id *newoid)
      {
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	struct strbuf tmp = STRBUF_INIT;
      	struct string_list to_stdin = STRING_LIST_INIT_DUP;
      	int code;
    @@ sequencer.c: static int pick_commits(struct repository *r,
      				st.st_size > 0) {
      			struct child_process notes_cp = CHILD_PROCESS_INIT;
     -			struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT;
    -+			struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++			struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      
      			notes_cp.in = open(rebase_path_rewritten_list(), O_RDONLY);
      			notes_cp.git_cmd = 1;
    @@ transport.c: static int run_pre_push_hook(struct transport *transport,
      {
      	int ret = 0;
     -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
    -+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
    ++	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
      	struct ref *r;
      	struct string_list to_stdin = STRING_LIST_INIT_NODUP;
      
39:  cdfe3b6e16 !  3:  c8a04306e9 hook: introduce "git hook list"
    @@ Commit message
     
         Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
     
    + ## Documentation/git-hook.txt ##
    +@@ Documentation/git-hook.txt: SYNOPSIS
    + [verse]
    + 'git hook' run [--to-stdin=<path>] [--ignore-missing] [(-j|--jobs) <n>]
    + 	<hook-name> [-- <hook-args>]
    ++'git hook' list <hook-name>
    + 
    + DESCRIPTION
    + -----------
    +@@ Documentation/git-hook.txt: optional `--` (or `--end-of-options`, see linkgit:gitcli[7]). The
    + arguments (if any) differ by hook name, see linkgit:githooks[5] for
    + what those are.
    + 
    ++list::
    ++	Print a list of hooks which will be run on `<hook-name>` event. If no
    ++	hooks are configured for that event, print nothing and return 1.
    ++
    + OPTIONS
    + -------
    + 
    +
      ## builtin/hook.c ##
     @@
      
      #define BUILTIN_HOOK_RUN_USAGE \
    - 	N_("git hook run [--to-stdin=<path>] <hook-name> [-- <hook-args>]")
    + 	N_("git hook run [--ignore-missing] [--to-stdin=<path>] <hook-name> [-- <hook-args>]")
     +#define BUILTIN_HOOK_LIST_USAGE \
     +	N_("git hook list <hook-name>")
      
    @@ builtin/hook.c: static const char * const builtin_hook_run_usage[] = {
     +
     +	head = hook_list(hookname);
     +
    -+	if (list_empty(head)) {
    -+		printf(_("no commands configured for hook '%s'\n"),
    -+		       hookname);
    -+		return 0;
    -+	}
    ++	if (list_empty(head))
    ++		return 1;
     +
     +	list_for_each(pos, head) {
     +		struct hook *item = list_entry(pos, struct hook, list);
    @@ builtin/hook.c: int cmd_hook(int argc, const char **argv, const char *prefix)
      
     
      ## hook.c ##
    -@@ hook.c: int hook_exists(const char *name)
    - struct list_head* hook_list(const char* hookname)
    +@@ hook.c: struct list_head *list_hooks(const char *hookname)
      {
      	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
    -+	const char *hook_path = find_hook(hookname);
    -+
      
    ++
      	INIT_LIST_HEAD(hook_head);
      
      	if (!hookname)
    - 		return NULL;
    +@@ hook.c: struct list_head *list_hooks(const char *hookname)
      
    --	if (have_git_dir()) {
    --		const char *hook_path = find_hook(hookname);
    + 	if (have_git_dir()) {
    + 		const char *hook_path = find_hook(hookname);
     -
     -		/* Add the hook from the hookdir */
    --		if (hook_path) {
    --			struct hook *to_add = xmalloc(sizeof(*to_add));
    --			to_add->hook_path = hook_path;
    --			to_add->feed_pipe_cb_data = NULL;
    --			list_add_tail(&to_add->list, hook_head);
    --		}
    -+	/* Add the hook from the hookdir */
    -+	if (hook_path) {
    -+		struct hook *to_add = xmalloc(sizeof(*to_add));
    -+		to_add->hook_path = hook_path;
    -+		to_add->feed_pipe_cb_data = NULL;
    -+		list_add_tail(&to_add->list, hook_head);
    - 	}
    - 
    - 	return hook_head;
    + 		if (hook_path) {
    + 			struct hook *to_add = xmalloc(sizeof(*to_add));
    + 			to_add->hook_path = hook_path;
40:  eb4e03e22b !  4:  af14116d0f hook: allow running non-native hooks
    @@ builtin/hook.c: static int list(int argc, const char **argv, const char *prefix)
      	hookname = argv[0];
      
     -	head = hook_list(hookname);
    -+	head = hook_list(hookname, 1);
    ++	head = list_hooks_gently(hookname);
      
    - 	if (list_empty(head)) {
    - 		printf(_("no commands configured for hook '%s'\n"),
    + 	if (list_empty(head))
    + 		return 1;
     @@ builtin/hook.c: static int run(int argc, const char **argv, const char *prefix)
      	git_config(git_default_config, NULL);
      
      	hook_name = argv[0];
    --	hooks = hook_list(hook_name);
    -+	hooks = hook_list(hook_name, 1);
    +-	hooks = list_hooks(hook_name);
    ++	hooks = list_hooks_gently(hook_name);
      	if (list_empty(hooks)) {
      		/* ... act like run_hooks_oneshot() under --ignore-missing */
      		if (ignore_missing)
    @@ hook.c: static int known_hook(const char *name)
     +	const char *hook_path;
      
      	if (!known_hook(name))
    - 		die(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
    +-		die(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
    ++		BUG(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
      		    name);
      
     +	hook_path = find_hook_gently(name);
    @@ hook.c: static int known_hook(const char *name)
      	strbuf_git_path(&path, "hooks/%s", name);
      	if (access(path.buf, X_OK) < 0) {
     @@ hook.c: int hook_exists(const char *name)
    - 	return !!find_hook(name);
    + 	return !list_empty(list_hooks(name));
      }
      
    --struct list_head* hook_list(const char* hookname)
     +struct hook_config_cb
     +{
     +	struct strbuf *hook_key;
     +	struct list_head *list;
     +};
     +
    -+struct list_head* hook_list(const char* hookname, int allow_unknown)
    + struct list_head *list_hooks(const char *hookname)
      {
    - 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
    --	const char *hook_path = find_hook(hookname);
    -+	const char *hook_path;
    +-	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
    ++	if (!known_hook(hookname))
    ++		BUG("Don't recognize hook event '%s'! "
    ++		    "Is it documented in Documentation/githooks.txt?",
    ++		    hookname);
    ++	return list_hooks_gently(hookname);
    ++}
      
    ++struct list_head *list_hooks_gently(const char *hookname)
    ++{
    ++	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
      
      	INIT_LIST_HEAD(hook_head);
    -@@ hook.c: struct list_head* hook_list(const char* hookname)
    - 	if (!hookname)
    - 		return NULL;
    - 
    -+	if (allow_unknown)
    -+		hook_path = find_hook_gently(hookname);
    -+	else
    -+		hook_path = find_hook(hookname);
    -+
    - 	/* Add the hook from the hookdir */
    - 	if (hook_path) {
    - 		struct hook *to_add = xmalloc(sizeof(*to_add));
    + 
    +@@ hook.c: struct list_head *list_hooks(const char *hookname)
    + 		BUG("null hookname was provided to hook_list()!");
    + 
    + 	if (have_git_dir()) {
    +-		const char *hook_path = find_hook(hookname);
    ++		const char *hook_path = find_hook_gently(hookname);
    + 		if (hook_path) {
    + 			struct hook *to_add = xmalloc(sizeof(*to_add));
    + 			to_add->hook_path = hook_path;
     @@ hook.c: int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
      	if (options->path_to_stdin && options->feed_pipe)
      		BUG("choose only one method to populate stdin");
      
    --	hooks = hook_list(hook_name);
     +	/*
    -+	 * 'git hooks run <hookname>' uses run_found_hooks, so we don't need to
    ++	 * 'git hooks run <hookname>' uses run_hooks, so we don't need to
     +	 * allow unknown hooknames here.
     +	 */
    -+	hooks = hook_list(hook_name, 0);
    + 	hooks = list_hooks(hook_name);
      
      	/*
    - 	 * If you need to act on a missing hook, use run_found_hooks()
     
      ## hook.h ##
     @@
    @@ hook.h
       * overwritten by further calls to find_hook and run_hook_*.
     + *
     + * If the hook is not a native hook (e.g. present in Documentation/githooks.txt)
    -+ * find_hook() will die(). find_hook_gently() does not consult the native hook
    ++ * find_hook() will BUG(). find_hook_gently() does not consult the native hook
     + * list and will check for any hook name in the hooks directory regardless of
     + * whether it is known. find_hook() should be used by internal calls to hooks,
     + * and find_hook_gently() should only be used when the hookname was provided by
    @@ hook.h: struct hook {
       * Provides a linked list of 'struct hook' detailing commands which should run
       * in response to the 'hookname' event, in execution order.
     + *
    -+ * If allow_unknown is unset, hooks will be checked against the hook list
    -+ * generated from Documentation/githooks.txt. Otherwise, any hook name will be
    -+ * allowed. allow_unknown should only be set when the hook name is provided by
    -+ * the user; internal calls to hook_list should make sure the hook they are
    -+ * invoking is present in Documentation/githooks.txt.
    ++ * list_hooks() checks the provided hookname against Documentation/githooks.txt
    ++ * and BUG()s if it is not found.  list_hooks_gently() allows any hookname. The
    ++ * latter should only be used when the hook name is provided by the user, and
    ++ * the former should be used by internal callers.
       */
    --struct list_head* hook_list(const char *hookname);
    -+struct list_head* hook_list(const char *hookname, int allow_unknown);
    + struct list_head *list_hooks(const char *hookname);
    ++struct list_head *list_hooks_gently(const char *hookname);
      
      struct run_hooks_opt
      {
41:  2c8e874158 !  5:  2bbb179962 hook: include hooks from the config
    @@ Commit message
     
      ## Documentation/config/hook.txt ##
     @@
    -+hook.<command>.command::
    -+	A command to execute during the <command> hook event. This can be an
    -+	executable on your device, a oneliner for your shell, or the name of a
    -+	hookcmd. See linkgit:git-hook[1].
    ++hook.<name>.command::
    ++	A command to execute whenever `hook.<name>` is invoked. `<name>` should
    ++	be a unique "friendly" name which you can use to identify this hook
    ++	command. (You can specify when to invoke this command with
    ++	`hook.<name>.event`.) The value can be an executable on your device or a
    ++	oneliner for your shell. If more than one value is specified for the
    ++	same `<name>`, the last value parsed will be the only command executed.
    ++	See linkgit:git-hook[1].
    ++
    ++hook.<name>.event::
    ++	The hook events which should invoke `hook.<name>`. `<name>` should be a
    ++	unique "friendly" name which you can use to identify this hook. The
    ++	value should be the name of a hook event, like "pre-commit" or "update".
    ++	(See linkgit:githooks[5] for a complete list of hooks Git knows about.)
    ++	On the specified event, the associated `hook.<name>.command` will be
    ++	executed. More than one event can be specified if you wish for
    ++	`hook.<name>` to execute on multiple events. See linkgit:git-hook[1].
     +
      hook.jobs::
      	Specifies how many hooks can be run simultaneously during parallelized
    @@ Documentation/git-hook.txt: Git is unlikely to use for a native hook later on. F
     +    event = pre-commit
     +    command = ~/bin/linter --c
     +
    -+Conmmands are run in the order Git encounters their associated
    ++In this example, `[hook "linter"]` represents one script - `~/bin/linter --c` -
    ++which can be shared by many repos, and even by many hook events, if appropriate.
    ++
    ++Commands are run in the order Git encounters their associated
     +`hook.<name>.event` configs during the configuration parse (see
    -+linkgit:git-config[1]).
    ++linkgit:git-config[1]). Although multiple `hook.linter.event` configs can be
    ++added, only one `hook.linter.command` event is valid - Git uses "last-one-wins"
    ++to determine which command to run.
    ++
    ++So if you wanted your linter to run when you commit as well as when you push,
    ++you would configure it like so:
    ++
    ++  [hook "linter"]
    ++    event = pre-commit
    ++    event = pre-push
    ++    command = ~/bin/linter --c
    ++
    ++With this config, `~/bin/linter --c` would be run by Git before a commit is
    ++generated (during `pre-commit`) as well as before a push is performed (during
    ++`pre-push`).
    ++
    ++And if you wanted to run your linter as well as a secret-leak detector during
    ++only the "pre-commit" hook event, you would configure it instead like so:
    ++
    ++  [hook "linter"]
    ++    event = pre-commit
    ++    command = ~/bin/linter --c
    ++  [hook "no-leaks"]
    ++    event = pre-commit
    ++    command = ~/bin/leak-detector
    ++
    ++With this config, before a commit is generated (during `pre-commit`), Git would
    ++first start `~/bin/linter --c` and second start `~/bin/leak-detector`. It would
    ++evaluate the output of each when deciding whether to proceed with the commit.
    ++
    ++For a full list of hook events which you can set your `hook.<name>.event` to,
    ++and how hooks are invoked during those events, see linkgit:githooks[5].
     +
     +In general, when instructions suggest adding a script to
     +`.git/hooks/<hook-event>`, you can specify it in the config instead by running
    @@ Documentation/git-hook.txt: Git is unlikely to use for a native hook later on. F
      optional `--` (or `--end-of-options`, see linkgit:gitcli[7]). The
     
      ## builtin/hook.c ##
    -@@ builtin/hook.c: static int list(int argc, const char **argv, const char *prefix)
    - 	head = hook_list(hookname, 1);
    - 
    - 	if (list_empty(head)) {
    --		printf(_("no commands configured for hook '%s'\n"),
    -+		printf(_("no hooks configured for event '%s'\n"),
    - 		       hookname);
    - 		return 0;
    - 	}
     @@ builtin/hook.c: static int list(int argc, const char **argv, const char *prefix)
      		struct hook *item = list_entry(pos, struct hook, list);
      		item = list_entry(pos, struct hook, list);
    @@ hook.c: static void free_hook(struct hook *ptr)
      static void remove_hook(struct list_head *to_remove)
      {
      	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
    -@@ hook.c: const char *find_hook_gently(const char *name)
    - 
    - int hook_exists(const char *name)
    - {
    --	return !!find_hook(name);
    -+	return !list_empty(hook_list(name, 0));
    - }
    +@@ hook.c: int hook_exists(const char *name)
      
      struct hook_config_cb
      {
    @@ hook.c: const char *find_hook_gently(const char *name)
      	struct list_head *list;
      };
      
    --struct list_head* hook_list(const char* hookname, int allow_unknown)
     +/*
     + * Callback for git_config which adds configured hooks to a hook list.  Hooks
     + * can be configured by specifying both hook.<friend-name>.command = <path> and
    @@ hook.c: const char *find_hook_gently(const char *name)
     +	return 0;
     +}
     +
    -+struct list_head* hook_list(const char *hookname, int allow_unknown)
    + struct list_head *list_hooks(const char *hookname)
    + {
    + 	if (!known_hook(hookname))
    +@@ hook.c: struct list_head *list_hooks(const char *hookname)
    + struct list_head *list_hooks_gently(const char *hookname)
      {
      	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
    --	const char *hook_path;
     +	struct hook_config_cb cb_data = {
     +		.hook_event = hookname,
     +		.list = hook_head,
     +	};
      
    -+	if (!allow_unknown && !known_hook(hookname))
    -+		die(_("Don't recognize hook event '%s'. "
    -+		      "Is it documented in 'githooks.txt'?"),
    -+		      hookname);
    - 
      	INIT_LIST_HEAD(hook_head);
      
      	if (!hookname)
    - 		return NULL;
    + 		BUG("null hookname was provided to hook_list()!");
      
    --	if (allow_unknown)
    --		hook_path = find_hook_gently(hookname);
    --	else
    --		hook_path = find_hook(hookname);
    +-	if (have_git_dir()) {
    +-		const char *hook_path = find_hook_gently(hookname);
    +-		if (hook_path) {
    +-			struct hook *to_add = xmalloc(sizeof(*to_add));
    +-			to_add->hook_path = hook_path;
    +-			to_add->feed_pipe_cb_data = NULL;
    +-			list_add_tail(&to_add->list, hook_head);
    +-		}
    +-	}
     +	/* Add the hooks from the config, e.g. hook.myhook.event = pre-commit */
     +	git_config(hook_config_lookup, &cb_data);
    - 
    --	/* Add the hook from the hookdir */
    --	if (hook_path) {
    --		struct hook *to_add = xmalloc(sizeof(*to_add));
    --		to_add->hook_path = hook_path;
    --		to_add->feed_pipe_cb_data = NULL;
    --		list_add_tail(&to_add->list, hook_head);
    --	}
    ++
     +	/* Add the hook from the hookdir. The placeholder makes it easier to
     +	 * allocate work in pick_next_hook. */
     +	if (find_hook_gently(hookname))
    @@ hook.c: static int notify_start_failure(struct strbuf *out,
      
      	return 1;
      }
    -@@ hook.c: int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
    - 		BUG("choose only one method to populate stdin");
    - 
    - 	/*
    --	 * 'git hooks run <hookname>' uses run_found_hooks, so we don't need to
    -+	 * 'git hooks run <hookname>' uses run_found_hooks, and we want to make
    -+	 * sure internal callers are using known hooks, so we don't need to
    - 	 * allow unknown hooknames here.
    - 	 */
    - 	hooks = hook_list(hook_name, 0);
     
      ## hook.h ##
     @@ hook.h: int hook_exists(const char *hookname);
42:  3216e51b6b !  6:  30ffe98601 hook: allow out-of-repo 'git hook' invocations
    @@ git.c: static struct cmd_struct commands[] = {
      	{ "init-db", cmd_init_db },
     
      ## hook.c ##
    -@@ hook.c: struct list_head* hook_list(const char *hookname, int allow_unknown)
    +@@ hook.c: struct list_head *list_hooks_gently(const char *hookname)
      
      	/* Add the hook from the hookdir. The placeholder makes it easier to
      	 * allocate work in pick_next_hook. */

Ævar Arnfjörð Bjarmason Aug. 24, 2021, 8:29 p.m. UTC | #1

On Wed, Aug 18 2021, Emily Shaffer wrote:

> This is the config-based hooks topic rebased onto v4 of Ævar's
> branch[1]. There is a happy CI build of it on GitHub[2].
> [...]
> Right now I'm trying to focus on this series first and foremost, hence
> sending two rerolls based on the same version of Ævar's base restart.
> I'll try to perform a code review on Ævar's latest tomorrow.
> [...]

First thank for the review on the base topic at
https://lore.kernel.org/git/YR2jLdYQA5CVzX5h@google.com/ &
https://lore.kernel.org/git/YR7r3h1AG4Zyn7x7@google.com/ and related.

I'll re-roll it soon pending your feedback on comments I left on this
series that pertain to it. I.e. there's some things that I think are
better just fixed up in the base topic (from trivial things like
whitespace changes, to some behavior changes), makes for easier reading
rather than going back & forth between the two.

I've read this whole series through, as promised in
https://lore.kernel.org/git/87lf4qeh86.fsf@evledraar.gmail.com/:

    [I'll discuss my opinions on the new and revised config schema in
    another mail, just commenting on the patch here in terms of its stated
    goals].

I.e. here's some general comments, numbered for ease of reference:

0)

I think this is in much much better shape vis-as-vis the simplified
config schema that's now being proposed re our discussion starting
around https://lore.kernel.org/git/87bl6ttmlv.fsf@evledraar.gmail.com/

I.e. the main complexity of the "skip" mechanism is gone, and also the
conflation of hook names with hook commands (the "rm -rf /" as a
<name-or-cmd> discussed in the above).

So before going any further, I'll just say that I wouldn't object much
to this design going in as-is. What I'm about to mention here below is
much closer to bikeshedding in my mind than "this is really to complex
to go in-tree", which was my opinion of the config schema before.

1)

On the current config design: First, before going into any detail on
that, I think whatever anyone's opinion is on that that the
design-focused patches as they stand could really use more a more
extended discussion of the design.

I.e. talk about the previously considered config schema, why it evolved
into its current form. The trade-offs involved, and why the patch
proposed to implement the schema it's implementing over another earlier
or alternate design.

I.e. https://lore.kernel.org/git/20210819033450.3382652-6-emilyshaffer@google.com
is two very short paragraphs. We won't be able to summarize all our
month-long discussion on the config design in one commit message, but I
think at least discussing it somewhat / linking to relevant on-list
discussions would make future source spelunking easier.

2)

So that out of the way, a comment on the current config design, which
should be read in the context of what I noted in #0. I.e. I'm *much*
happier with this version.

That being said I'm still not convinced that the simple 1=1 mapping of
"hook.<name>.command" and its "value" should be followed by the 1=many
mapping of "hook.<name>.event" and its "value".

I.e. we're making the trade-off of saving the user from typing out or
specifying:

    [hook "my-pre-commit"]
    command = ~/hooks/pre-commit-or-push
    event = pre-commit
    [hook "my-pre-push"]
    command = ~/hooks/pre-commit-or-push
    event = pre-push

And instead being able to do:

    [hook "my-pre-commit-or-push"]
    command = ~/hooks/pre-commit-or-push
    event = pre-commit
    event = pre-push

So for the very common case, saving two config lines. "Two" because as
we discussed[1] as there's currently no GIT_HOOK_TYPE env var. So this
form will work pretty much only for that case.

I.e. unlike with .git/hook/<name> the hook run via config can't
determine what <hook-type> it's being run at, so as it stands this is
only useful for those hooks listed in githooks(5) where someone would
want to do the exact same thing for one or more <hook-name> names. You
can't use it as a general routing mechanism for any hook type as it
stands.

I *think* that's only these two, perhaps "update" and "pre-receive",
with the hook seeing if it consume stdin/has arguments to disambiguate
the two.

But even with a GIT_HOOK_TYPE passed the trade-off, as discussed in [1],
and downthread in [2], is that by having it 1=many we're closing the
door on any future hook.<name>.<whatever>. I.e. config that would change
the behavior of that hook, but you'd want to change it in another way
for at least one of the N event types.

Well, "closing the door" as in if you'd want that you'd have to split up
the section from the "my-pre-commit-or-push" example above to the
"my-pre-commit" and "my-pre-push" example.

But again, on the "is the complexity worth it" we're then having to
explain to users that they can do it one way if the want no config other
than hook.<name>.{command,event*}, but another if they have another key
in that namespace.

You've said that you wanted to add something like a GIT_HOOK_TYPE
environment variable. Fair enough, and I guess we could add it in a
re-roll of this series. I'm mainly commenting on the end-state of *this*
series in particular. I.e. I think it leaves the user & implementation
with a config schema that still seems to be needlessly complex for the
very limited benefits that complexity brings us in what you're able to
do with it now.

But some of that goes back to the comments I had on 5/6[3], i.e. I'm
willing to be convinced, but I think that the current commit message &
added docs aren't really selling the idea of why it's worth it.

3) 

As an extension to my comments on 5/6[3], I think this whole notion of
"git hook run <name>" as invoked by a user of git is just more confusing
the more I think about it.

I.e. 5/6[4] is apparently seeking to implement a way to just make that
facility a general way for users to run some command on their system to
do whatever, instead of say using /usr/bin/parallel or a shell alias.

But then we also use that command as our own dispatch mechanism for our
own known hooks, except mostly not, since we mostly use the C API
ourselves directly.

It's particularly confusing that if you say run "git hook run
pre-auto-gc" as a user to test your hook you'll have that hook run in
the same way that git-gc(1) would run it. So someone developing a hook
might think they can use "git hook run" for testing it.

But if you do the same with say "git hook run pre-receive" or anything
else that feeds arguments or stdin (e.g. "update", or "pre-receive"),
you'll have your hook happily being run by git, but in a way that's not
at all how such a hook will be run when it's run by git "for real".

So I wonder if we shouldn't just have the thing die() if you try to run
any hook that's in githooks(5) itself, except for sendemail-validate and
the p4 hooks, since we need to run those ourselves.

Or have those use an internal-only "git hook--helper", and start out
with "git hook" just supporting "git hook list", and then later on have
"git hook run" (or perhaps "git hook run-configured"?) be an entry point
for this facility of running some arbitrary script that's not a "real"
hook.

I don't know, maybe I'm the only one that finds this confusing...

1. https://lore.kernel.org/git/87bl6ttmlv.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/877dh0n1b3.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/87lf4qeh86.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/20210819033450.3382652-6-emilyshaffer@google.com/

Emily Shaffer Sept. 2, 2021, 10:01 p.m. UTC | #2

On Tue, Aug 24, 2021 at 10:29:33PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Wed, Aug 18 2021, Emily Shaffer wrote:
> 0)
> 
> I think this is in much much better shape vis-as-vis the simplified
> config schema that's now being proposed re our discussion starting
> around https://lore.kernel.org/git/87bl6ttmlv.fsf@evledraar.gmail.com/
> 
> I.e. the main complexity of the "skip" mechanism is gone, and also the
> conflation of hook names with hook commands (the "rm -rf /" as a
> <name-or-cmd> discussed in the above).
> 
> So before going any further, I'll just say that I wouldn't object much
> to this design going in as-is. What I'm about to mention here below is
> much closer to bikeshedding in my mind than "this is really to complex
> to go in-tree", which was my opinion of the config schema before.
> 
> 1)
> 
> On the current config design: First, before going into any detail on
> that, I think whatever anyone's opinion is on that that the
> design-focused patches as they stand could really use more a more
> extended discussion of the design.
> 
> I.e. talk about the previously considered config schema, why it evolved
> into its current form. The trade-offs involved, and why the patch
> proposed to implement the schema it's implementing over another earlier
> or alternate design.
> 
> I.e. https://lore.kernel.org/git/20210819033450.3382652-6-emilyshaffer@google.com
> is two very short paragraphs. We won't be able to summarize all our
> month-long discussion on the config design in one commit message, but I
> think at least discussing it somewhat / linking to relevant on-list
> discussions would make future source spelunking easier.

Hum. I don't think that it's necessary to summarize the whole discussion
in the commit message, but I think it's worth it to describe the
rationale ("we like this design because it makes x good practice easy
and y bug hard") and link out to the discussion for parties who are
interested in reading more.

> 
> 2)
> 
> So that out of the way, a comment on the current config design, which
> should be read in the context of what I noted in #0. I.e. I'm *much*
> happier with this version.
> 
> That being said I'm still not convinced that the simple 1=1 mapping of
> "hook.<name>.command" and its "value" should be followed by the 1=many
> mapping of "hook.<name>.event" and its "value".
> 
> I.e. we're making the trade-off of saving the user from typing out or
> specifying:
>     
>     [hook "my-pre-commit"]
>     command = ~/hooks/pre-commit-or-push
>     event = pre-commit
>     [hook "my-pre-push"]
>     command = ~/hooks/pre-commit-or-push
>     event = pre-push
> 
> And instead being able to do:
> 
>     [hook "my-pre-commit-or-push"]
>     command = ~/hooks/pre-commit-or-push
>     event = pre-commit
>     event = pre-push
> 
> So for the very common case, saving two config lines. "Two" because as
> we discussed[1] as there's currently no GIT_HOOK_TYPE env var. So this
> form will work pretty much only for that case.
> 
> I.e. unlike with .git/hook/<name> the hook run via config can't
> determine what <hook-type> it's being run at, so as it stands this is
> only useful for those hooks listed in githooks(5) where someone would
> want to do the exact same thing for one or more <hook-name> names. You
> can't use it as a general routing mechanism for any hook type as it
> stands.
> 
> I *think* that's only these two, perhaps "update" and "pre-receive",
> with the hook seeing if it consume stdin/has arguments to disambiguate
> the two.
> 
> But even with a GIT_HOOK_TYPE passed the trade-off, as discussed in [1],
> and downthread in [2], is that by having it 1=many we're closing the
> door on any future hook.<name>.<whatever>. I.e. config that would change
> the behavior of that hook, but you'd want to change it in another way
> for at least one of the N event types.

I'm not really sure that's the case, to be honest. Even with the config
scheme as is in this iteration, you can still define it the way you're
describing with no problem, and in fact it makes it easier for users to
apply some special config to all-but-one invocation.

Let's say for example the "git-secrets" hook, which we do have defined
for at least 3 different events at Google today, and a hypothetical
"parallelize me" config; for the sake of argument let's presume that
this is the far future and we've added a GIT_HOOK_TYPE envvar:

[hook "git-secrets"]
  command = /bin/git-secrets
  event = pre-commit
  event = pre-merge-commit
  parallelizable = true

[hook "git-secrets-mutexed"]
  command = /bin/git-secrets
  event = prepare-commit-msg
  event = commit-msg
  parallelizable = false

If we don't allow "hook.myhookname.event" to be multiply configurable,
then the user gets this really tedious task of defining every single
"hook.git-secrets-$hookevent.parallelizable" config.

> 
> Well, "closing the door" as in if you'd want that you'd have to split up
> the section from the "my-pre-commit-or-push" example above to the
> "my-pre-commit" and "my-pre-push" example.
> 
> But again, on the "is the complexity worth it" we're then having to
> explain to users that they can do it one way if the want no config other
> than hook.<name>.{command,event*}, but another if they have another key
> in that namespace.

I am not so worried about "this will be hard to explain, so we should
not do it" - I think we can make the documentation useful with enough
effort and expertise. (And yes, I'm feeling optimistic because I have an
actual technical writer taking a look at the manpage right now.)

> 
> You've said that you wanted to add something like a GIT_HOOK_TYPE
> environment variable. Fair enough, and I guess we could add it in a
> re-roll of this series. I'm mainly commenting on the end-state of *this*
> series in particular. I.e. I think it leaves the user & implementation
> with a config schema that still seems to be needlessly complex for the
> very limited benefits that complexity brings us in what you're able to
> do with it now.

I think that limiting ourselves in the way you're describing will make
it more difficult to bring additional benefits later. It is certainly
possible and valid to write your configuration the way you are
describing, without changing this schema.

> 
> But some of that goes back to the comments I had on 5/6[3], i.e. I'm
> willing to be convinced, but I think that the current commit message &
> added docs aren't really selling the idea of why it's worth it.

Ok. I think your point is not "the schema is still wrong" as much as it
is "the documentation could be much better", and I agree.

> 
> 3) 
> 
> As an extension to my comments on 5/6[3], I think this whole notion of
> "git hook run <name>" as invoked by a user of git is just more confusing
> the more I think about it.
> 
> I.e. 5/6[4] is apparently seeking to implement a way to just make that
> facility a general way for users to run some command on their system to
> do whatever, instead of say using /usr/bin/parallel or a shell alias.
> 
> But then we also use that command as our own dispatch mechanism for our
> own known hooks, except mostly not, since we mostly use the C API
> ourselves directly.
> 
> It's particularly confusing that if you say run "git hook run
> pre-auto-gc" as a user to test your hook you'll have that hook run in
> the same way that git-gc(1) would run it. So someone developing a hook
> might think they can use "git hook run" for testing it.
> 
> But if you do the same with say "git hook run pre-receive" or anything
> else that feeds arguments or stdin (e.g. "update", or "pre-receive"),
> you'll have your hook happily being run by git, but in a way that's not
> at all how such a hook will be run when it's run by git "for real".
> 
> So I wonder if we shouldn't just have the thing die() if you try to run
> any hook that's in githooks(5) itself, except for sendemail-validate and
> the p4 hooks, since we need to run those ourselves.

I think this again falls to a documentation issue. I would love to have
a tighter loop when developing a hook that takes weird mid-merge
arguments or whatever; making it easier to test the hook in isolation
with known inputs sounds like a good thing to me.

Personally, "I ran 'git hook run pre-receive' without being in the
middle of a receive operation and it didn't behave like it was in the
middle of a receive operation!" doesn't sound all that surprising to me.

> 
> Or have those use an internal-only "git hook--helper", and start out
> with "git hook" just supporting "git hook list", and then later on have
> "git hook run" (or perhaps "git hook run-configured"?) be an entry point
> for this facility of running some arbitrary script that's not a "real"
> hook.
> 
> I don't know, maybe I'm the only one that finds this confusing...
> 
> 1. https://lore.kernel.org/git/87bl6ttmlv.fsf@evledraar.gmail.com/
> 2. https://lore.kernel.org/git/877dh0n1b3.fsf@evledraar.gmail.com/
> 3. https://lore.kernel.org/git/87lf4qeh86.fsf@evledraar.gmail.com/
> 4. https://lore.kernel.org/git/20210819033450.3382652-6-emilyshaffer@google.com/

Besides "make the docs more obvious", I don't think there is anything in
this mail that I want to act on. I am very happy with the config schema
as it is, as well as the behavior of 'git hook run'.


As I mentioned here and somewhere else in the review, I am in the
process of getting feedback on the manpage from a tech writer this week,
so do not expect a reroll from me until at least next week. I saw your
reroll today and I'll try and look at it (or at least the interdiff)
tomorrow or Monday.

 - Emily

[v3,0/6] config-based hooks restarted

Message

Comments