[v2,1/6] hook: run a list of hooks instead

Message ID	20210812004258.74318-2-emilyshaffer@google.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> Date: Wed, 11 Aug 2021 17:42:53 -0700 In-Reply-To: <20210812004258.74318-1-emilyshaffer@google.com> Message-Id: <20210812004258.74318-2-emilyshaffer@google.com> Mime-Version: 1.0 References: <20210812004258.74318-1-emilyshaffer@google.com> Subject: [PATCH v2 1/6] hook: run a list of hooks instead From: Emily Shaffer <emilyshaffer@google.com> To: git@vger.kernel.org Cc: Emily Shaffer <emilyshaffer@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	config-based hooks restarted \| expand [v2,0/6] config-based hooks restarted [v2,1/6] hook: run a list of hooks instead [v2,2/6] hook: allow parallel hook execution [v2,3/6] hook: introduce "git hook list" [v2,4/6] hook: allow running non-native hooks [v2,5/6] hook: include hooks from the config [v2,6/6] hook: allow out-of-repo 'git hook' invocations

Message ID

20210812004258.74318-2-emilyshaffer@google.com (mailing list archive)

State

Superseded

Headers

Date: Wed, 11 Aug 2021 17:42:53 -0700
In-Reply-To: <20210812004258.74318-1-emilyshaffer@google.com>
Message-Id: <20210812004258.74318-2-emilyshaffer@google.com>
Mime-Version: 1.0
References: <20210812004258.74318-1-emilyshaffer@google.com>
Subject: [PATCH v2 1/6] hook: run a list of hooks instead
From: Emily Shaffer <emilyshaffer@google.com>
To: git@vger.kernel.org
Cc: Emily Shaffer <emilyshaffer@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

Series

config-based hooks restarted | expand

Commit Message

Emily Shaffer Aug. 12, 2021, 12:42 a.m. UTC

To prepare for multihook support, teach hook.[hc] to take a list of
hooks at run_hooks and run_found_hooks. Right now the list is always one
entry, but in the future we will allow users to supply more than one
executable for a single hook event.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/hook.c |  14 ++++---
 hook.c         | 103 +++++++++++++++++++++++++++++++++++--------------
 hook.h         |  16 +++++++-
 3 files changed, 96 insertions(+), 37 deletions(-)

Comments

Junio C Hamano Aug. 12, 2021, 5:25 p.m. UTC | #1

Emily Shaffer <emilyshaffer@google.com> writes:

> To prepare for multihook support, teach hook.[hc] to take a list of
> hooks at run_hooks and run_found_hooks. Right now the list is always one
> entry, but in the future we will allow users to supply more than one
> executable for a single hook event.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  builtin/hook.c |  14 ++++---
>  hook.c         | 103 +++++++++++++++++++++++++++++++++++--------------
>  hook.h         |  16 +++++++-
>  3 files changed, 96 insertions(+), 37 deletions(-)
>
> diff --git a/builtin/hook.c b/builtin/hook.c
> index 5eb7cf73a4..4d39c9e75e 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -25,7 +25,8 @@ static int run(int argc, const char **argv, const char *prefix)
>  	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
>  	int ignore_missing = 0;
>  	const char *hook_name;
> -	const char *hook_path;
> +	struct list_head *hooks;
> +

Natural.  We used to use the path to the hook because we were
expecting only one. We now use the name to find a list of hooks.

All the caller sees is just list_head without any direct visibility
into it, which feels like a great abstraction.  Presumably everything
will go through the API functions taking this opaque "list of hooks"
thing (or "the first one in the list" if the API function does not
iterate over it, perhaps?).

>  	struct option run_options[] = {
>  		OPT_BOOL(0, "ignore-missing", &ignore_missing,
>  			 N_("exit quietly with a zero exit code if the requested hook cannot be found")),
> @@ -58,15 +59,16 @@ static int run(int argc, const char **argv, const char *prefix)
>  	git_config(git_default_config, NULL);
>  
>  	hook_name = argv[0];
> -	if (ignore_missing)
> -		return run_hooks_oneshot(hook_name, &opt);
> -	hook_path = find_hook(hook_name);
> -	if (!hook_path) {
> +	hooks = hook_list(hook_name);

OK.

> +	if (list_empty(hooks)) {
> +		/* ... act like run_hooks_oneshot() under --ignore-missing */
> +		if (ignore_missing)
> +			return 0;

OK.

>  		error("cannot find a hook named %s", hook_name);
>  		return 1;
>  	}

> diff --git a/hook.c b/hook.c
> index ee20b2e365..80e150548c 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -4,6 +4,28 @@
>  #include "hook-list.h"
>  #include "config.h"
>  
> +static void free_hook(struct hook *ptr)
> +{
> +	if (ptr) {
> +		free(ptr->feed_pipe_cb_data);
> +	}

Lose the extra {}, as we do not do more than the above free() even
at the end of the series?

> +	free(ptr);
> +}
> +
> +static void remove_hook(struct list_head *to_remove)
> +{
> +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> +	list_del(to_remove);
> +	free_hook(hook_to_remove);
> +}
> +
> +void clear_hook_list(struct list_head *head)
> +{
> +	struct list_head *pos, *tmp;
> +	list_for_each_safe(pos, tmp, head)
> +		remove_hook(pos);
> +}

OK.

> +struct list_head* hook_list(const char* hookname)

Shift both of the asterisks; our asterisks do not stick to the type
but to the identifier.

> +{
> +	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
> +
> +	INIT_LIST_HEAD(hook_head);
> +
> +	if (!hookname)
> +		return NULL;

Checking for invalid hookname first would avoid leaking hook_head,
no?  The caller of hook_list() we saw earlier immediately calls
list_empty() which will segfault.  The caller need to be tightened,
but I wonder if it would be a programmer's error to pass a NULL
hookname to this function.  If so, this can simply be a BUG() and
the earlier allocation and initialization of hook_head can stay
where they are.  Otherwise, the caller should see if this returns a
NULL.

> +	if (have_git_dir()) {
> +		const char *hook_path = find_hook(hookname);
> +
> +		/* Add the hook from the hookdir */
> +		if (hook_path) {
> +			struct hook *to_add = xmalloc(sizeof(*to_add));
> +			to_add->hook_path = hook_path;
> +			to_add->feed_pipe_cb_data = NULL;
> +			list_add_tail(&to_add->list, hook_head);
> +		}
> +	}

Calling this function to grab a list of hooks when we are not in any
repository is not an error but just we get "there is nothing to
run".  Does the design give us a more useful behaviour, compared to
alternatives like "you have to be in a repository or calling this
function is an error"?

Not an objection wrapped in a rhetorical question, but a genuine
question.  "It would help this and that usecase" would be an ideal
answer, "We could do either way, but we happened to have written the
code this way first, and at the end of the series we did not see any
practical downsides" would also be a great answer.

> +	return hook_head;
> +}
> +

> +/*
> + * Provides a linked list of 'struct hook' detailing commands which should run
> + * in response to the 'hookname' event, in execution order.
> + */
> +struct list_head* hook_list(const char *hookname);

struct list_head *hook_list(const char *hookname);

Emily Shaffer Aug. 16, 2021, 11:38 p.m. UTC | #2

On Thu, Aug 12, 2021 at 10:25:03AM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > To prepare for multihook support, teach hook.[hc] to take a list of
> > hooks at run_hooks and run_found_hooks. Right now the list is always one
> > entry, but in the future we will allow users to supply more than one
> > executable for a single hook event.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  builtin/hook.c |  14 ++++---
> >  hook.c         | 103 +++++++++++++++++++++++++++++++++++--------------
> >  hook.h         |  16 +++++++-
> >  3 files changed, 96 insertions(+), 37 deletions(-)
> >
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index 5eb7cf73a4..4d39c9e75e 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -25,7 +25,8 @@ static int run(int argc, const char **argv, const char *prefix)
> >  	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
> >  	int ignore_missing = 0;
> >  	const char *hook_name;
> > -	const char *hook_path;
> > +	struct list_head *hooks;
> > +
> 
> Natural.  We used to use the path to the hook because we were
> expecting only one. We now use the name to find a list of hooks.
> 
> All the caller sees is just list_head without any direct visibility
> into it, which feels like a great abstraction.  Presumably everything
> will go through the API functions taking this opaque "list of hooks"
> thing (or "the first one in the list" if the API function does not
> iterate over it, perhaps?).

Hum, I guess that in a later patch builtin/hook.c does learn to take
apart the list_head into a 'struct hook' to print the output of 'git
hook list'. I haven't read your review of that patch yet though.

> > diff --git a/hook.c b/hook.c
> > index ee20b2e365..80e150548c 100644
> > --- a/hook.c
> > +++ b/hook.c
> > @@ -4,6 +4,28 @@
> >  #include "hook-list.h"
> >  #include "config.h"
> >  
> > +static void free_hook(struct hook *ptr)
> > +{
> > +	if (ptr) {
> > +		free(ptr->feed_pipe_cb_data);
> > +	}
> 
> Lose the extra {}, as we do not do more than the above free() even
> at the end of the series?

ACK

> 
> > +struct list_head* hook_list(const char* hookname)
> 
> Shift both of the asterisks; our asterisks do not stick to the type
> but to the identifier.

ACK

> 
> > +{
> > +	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
> > +
> > +	INIT_LIST_HEAD(hook_head);
> > +
> > +	if (!hookname)
> > +		return NULL;
> 
> Checking for invalid hookname first would avoid leaking hook_head,
> no?  The caller of hook_list() we saw earlier immediately calls
> list_empty() which will segfault.  The caller need to be tightened,
> but I wonder if it would be a programmer's error to pass a NULL
> hookname to this function.  If so, this can simply be a BUG() and
> the earlier allocation and initialization of hook_head can stay
> where they are.  Otherwise, the caller should see if this returns a
> NULL.

Ah, good point. I think this makes sense to BUG().

> 
> > +	if (have_git_dir()) {
> > +		const char *hook_path = find_hook(hookname);
> > +
> > +		/* Add the hook from the hookdir */
> > +		if (hook_path) {
> > +			struct hook *to_add = xmalloc(sizeof(*to_add));
> > +			to_add->hook_path = hook_path;
> > +			to_add->feed_pipe_cb_data = NULL;
> > +			list_add_tail(&to_add->list, hook_head);
> > +		}
> > +	}
> 
> Calling this function to grab a list of hooks when we are not in any
> repository is not an error but just we get "there is nothing to
> run".  Does the design give us a more useful behaviour, compared to
> alternatives like "you have to be in a repository or calling this
> function is an error"?

Later we enable calling hook_list without a gitdir, in patch 6
(https://lore.kernel.org/git/20210812004258.74318-7-emilyshaffer%40google.com).
So maybe the behavior as it is now is premature?

But leaving the hook list empty means we can behave gracefully if
anybody is calling a hook without necessarily being sure they have
gitdir. I would need to audit callsites to check if they are checking
whether they have one or not. I think in most cases they probably are
checking that.

> 
> Not an objection wrapped in a rhetorical question, but a genuine
> question.  "It would help this and that usecase" would be an ideal
> answer, "We could do either way, but we happened to have written the
> code this way first, and at the end of the series we did not see any
> practical downsides" would also be a great answer.

The main use case, today, for letting this work nicely even outside of
a gitdir, is sendemail-validate hook. But mostly, I was thinking that
when we're freed from dependence on .git/hooks/, there's no reason to
disallow out-of-repo hooks in case someone wants to add a new one in the
future - for example to 'git maintenance' daemon runs.

> 
> > +	return hook_head;
> > +}
> > +
> 
> > +/*
> > + * Provides a linked list of 'struct hook' detailing commands which should run
> > + * in response to the 'hookname' event, in execution order.
> > + */
> > +struct list_head* hook_list(const char *hookname);
> 
> struct list_head *hook_list(const char *hookname);
ACK

Thanks for the thoughts.
 - Emily

diff --git a/builtin/hook.c b/builtin/hook.c
index 5eb7cf73a4..4d39c9e75e 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -25,7 +25,8 @@  static int run(int argc, const char **argv, const char *prefix)
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 	int ignore_missing = 0;
 	const char *hook_name;
-	const char *hook_path;
+	struct list_head *hooks;
+
 	struct option run_options[] = {
 		OPT_BOOL(0, "ignore-missing", &ignore_missing,
 			 N_("exit quietly with a zero exit code if the requested hook cannot be found")),
@@ -58,15 +59,16 @@  static int run(int argc, const char **argv, const char *prefix)
 	git_config(git_default_config, NULL);
 
 	hook_name = argv[0];
-	if (ignore_missing)
-		return run_hooks_oneshot(hook_name, &opt);
-	hook_path = find_hook(hook_name);
-	if (!hook_path) {
+	hooks = hook_list(hook_name);
+	if (list_empty(hooks)) {
+		/* ... act like run_hooks_oneshot() under --ignore-missing */
+		if (ignore_missing)
+			return 0;
 		error("cannot find a hook named %s", hook_name);
 		return 1;
 	}
 
-	ret = run_hooks(hook_name, hook_path, &opt);
+	ret = run_hooks(hook_name, hooks, &opt);
 	run_hooks_opt_clear(&opt);
 	return ret;
 usage:
diff --git a/hook.c b/hook.c
index ee20b2e365..80e150548c 100644
--- a/hook.c
+++ b/hook.c
@@ -4,6 +4,28 @@ 
 #include "hook-list.h"
 #include "config.h"
 
+static void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		free(ptr->feed_pipe_cb_data);
+	}
+	free(ptr);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
 static int known_hook(const char *name)
 {
 	const char **p;
@@ -71,6 +93,30 @@  int hook_exists(const char *name)
 	return !!find_hook(name);
 }
 
+struct list_head* hook_list(const char* hookname)
+{
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	if (have_git_dir()) {
+		const char *hook_path = find_hook(hookname);
+
+		/* Add the hook from the hookdir */
+		if (hook_path) {
+			struct hook *to_add = xmalloc(sizeof(*to_add));
+			to_add->hook_path = hook_path;
+			to_add->feed_pipe_cb_data = NULL;
+			list_add_tail(&to_add->list, hook_head);
+		}
+	}
+
+	return hook_head;
+}
+
 void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
@@ -128,7 +174,10 @@  static int pick_next_hook(struct child_process *cp,
 	cp->dir = hook_cb->options->dir;
 
 	/* add command */
-	strvec_push(&cp->args, run_me->hook_path);
+	if (hook_cb->options->absolute_path)
+		strvec_push(&cp->args, absolute_path(run_me->hook_path));
+	else
+		strvec_push(&cp->args, run_me->hook_path);
 
 	/*
 	 * add passed-in argv, without expanding - let the user get back
@@ -139,12 +188,12 @@  static int pick_next_hook(struct child_process *cp,
 	/* Provide context for errors if necessary */
 	*pp_task_cb = run_me;
 
-	/*
-	 * This pick_next_hook() will be called again, we're only
-	 * running one hook, so indicate that no more work will be
-	 * done.
-	 */
-	hook_cb->run_me = NULL;
+	/* Get the next entry ready */
+	if (hook_cb->run_me->list.next == hook_cb->head)
+		hook_cb->run_me = NULL;
+	else
+		hook_cb->run_me = list_entry(hook_cb->run_me->list.next,
+					     struct hook, list);
 
 	return 1;
 }
@@ -179,13 +228,9 @@  static int notify_hook_finished(int result,
 	return 0;
 }
 
-int run_hooks(const char *hook_name, const char *hook_path,
-	      struct run_hooks_opt *options)
+int run_hooks(const char *hook_name, struct list_head *hooks,
+		    struct run_hooks_opt *options)
 {
-	struct strbuf abs_path = STRBUF_INIT;
-	struct hook my_hook = {
-		.hook_path = hook_path,
-	};
 	struct hook_cb_data cb_data = {
 		.rc = 0,
 		.hook_name = hook_name,
@@ -197,11 +242,9 @@  int run_hooks(const char *hook_name, const char *hook_path,
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
-	if (options->absolute_path) {
-		strbuf_add_absolute_path(&abs_path, hook_path);
-		my_hook.hook_path = abs_path.buf;
-	}
-	cb_data.run_me = &my_hook;
+
+	cb_data.head = hooks;
+	cb_data.run_me = list_first_entry(hooks, struct hook, list);
 
 	run_processes_parallel_tr2(jobs,
 				   pick_next_hook,
@@ -213,18 +256,15 @@  int run_hooks(const char *hook_name, const char *hook_path,
 				   "hook",
 				   hook_name);
 
-
-	if (options->absolute_path)
-		strbuf_release(&abs_path);
-	free(my_hook.feed_pipe_cb_data);
+	clear_hook_list(hooks);
 
 	return cb_data.rc;
 }
 
 int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
 {
-	const char *hook_path;
-	int ret;
+	struct list_head *hooks;
+	int ret = 0;
 	struct run_hooks_opt hook_opt_scratch = RUN_HOOKS_OPT_INIT;
 
 	if (!options)
@@ -233,14 +273,19 @@  int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
 	if (options->path_to_stdin && options->feed_pipe)
 		BUG("choose only one method to populate stdin");
 
-	hook_path = find_hook(hook_name);
-	if (!hook_path) {
-		ret = 0;
+	hooks = hook_list(hook_name);
+
+	/*
+	 * If you need to act on a missing hook, use run_found_hooks()
+	 * instead
+	 */
+	if (list_empty(hooks))
 		goto cleanup;
-	}
 
-	ret = run_hooks(hook_name, hook_path, options);
+	ret = run_hooks(hook_name, hooks, options);
+
 cleanup:
 	run_hooks_opt_clear(options);
+	clear_hook_list(hooks);
 	return ret;
 }
diff --git a/hook.h b/hook.h
index 58dfbf474c..7705e6a529 100644
--- a/hook.h
+++ b/hook.h
@@ -3,6 +3,7 @@ 
 #include "strbuf.h"
 #include "strvec.h"
 #include "run-command.h"
+#include "list.h"
 
 /*
  * Returns the path to the hook file, or NULL if the hook is missing
@@ -17,6 +18,7 @@  const char *find_hook(const char *name);
 int hook_exists(const char *hookname);
 
 struct hook {
+	struct list_head list;
 	/* The path to the hook */
 	const char *hook_path;
 
@@ -27,6 +29,12 @@  struct hook {
 	void *feed_pipe_cb_data;
 };
 
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const char *hookname);
+
 struct run_hooks_opt
 {
 	/* Environment vars to be set for each hook */
@@ -97,6 +105,7 @@  struct hook_cb_data {
 	/* rc reflects the cumulative failure state */
 	int rc;
 	const char *hook_name;
+	struct list_head *head;
 	struct hook *run_me;
 	struct run_hooks_opt *options;
 	int *invoked_hook;
@@ -110,8 +119,8 @@  void run_hooks_opt_clear(struct run_hooks_opt *o);
  *
  * See run_hooks_oneshot() for the simpler one-shot API.
  */
-int run_hooks(const char *hookname, const char *hook_path,
-	      struct run_hooks_opt *options);
+int run_hooks(const char *hookname, struct list_head *hooks,
+		    struct run_hooks_opt *options);
 
 /**
  * Calls find_hook() on your "hook_name" and runs the hooks (if any)
@@ -123,4 +132,7 @@  int run_hooks(const char *hookname, const char *hook_path,
  */
 int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options);
 
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
+
 #endif

[v2,1/6] hook: run a list of hooks instead

Commit Message

Comments

Patch