Message ID | 20191025025129.250049-6-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | add git-bugreport tool | expand |
Hi Emily, On Thu, 24 Oct 2019, Emily Shaffer wrote: > Occasionally a failure a user is seeing may be related to a specific > hook which is being run, perhaps without the user realizing. While the > contents of hooks can be sensitive - containing user data or process > information specific to the user's organization - simply knowing that a > hook is being run at a certain stage can help us to understand whether > something is going wrong. > > Without a definitive list of hook names within the code, we compile our > own list from the documentation. This is likely prone to bitrot. To > reduce the amount of code humans need to read, we turn the list into a > string_list and iterate over it (as we are calling the same find_hook > operation on each string). However, since bugreport should primarily be > called by the user, the performance loss from massaging the string > seems acceptable. Good idea! > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > bugreport.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ > bugreport.h | 6 ++++++ > builtin/bugreport.c | 4 ++++ > 3 files changed, 54 insertions(+) > > diff --git a/bugreport.c b/bugreport.c > index afa4836ab1..9d7f44ff28 100644 > --- a/bugreport.c > +++ b/bugreport.c > @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info) > strbuf_reset(config_info); > strbuf_addbuf(config_info, &configs_and_values); > } > + > +void get_populated_hooks(struct strbuf *hook_info) > +{ > + /* > + * Doesn't look like there is a list of all possible hooks; so below is > + * a transcription of `git help hook`. > + */ > + const char *hooks = "applypatch-msg," > + "pre-applypatch," > + "post-applypatch," > + "pre-commit," > + "pre-merge-commit," > + "prepare-commit-msg," > + "commit-msg," > + "post-commit," > + "pre-rebase," > + "post-checkout," > + "post-merge," > + "pre-push," > + "pre-receive," > + "update," > + "post-receive," > + "post-update," > + "push-to-checkout," > + "pre-auto-gc," > + "post-rewrite," > + "sendemail-validate," > + "fsmonitor-watchman," > + "p4-pre-submit," > + "post-index-changex"; Well, this is disappointing: I tried to come up with a scripted way to extract the commit names from our source code, and I failed! This is where I gave up: git grep run_hook | sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' | sed -e '/^name$/d' -e '/^const char \*name$/d' \ -e 's/push_to_checkout_hook/"push-to-checkout"/' | sort | uniq This prints only the following hook names: "applypatch-msg" "post-applypatch" "post-checkout" "post-index-change" "post-merge" "pre-applypatch" "pre-auto-gc" "pre-rebase" "prepare-commit-msg" "push-to-checkout" But at least it was easy to script the extracting from the Documentation: sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \ Documentation/githooks.txt > + struct string_list hooks_list = STRING_LIST_INIT_DUP; > + struct string_list_item *iter = NULL; > + > + strbuf_reset(hook_info); > + > + string_list_split(&hooks_list, hooks, ',', -1); > + > + for_each_string_list_item(iter, &hooks_list) { This should definitely be done at compile time, I think. We should be able to generate an appropriate header via something like this: cat >hook-names.h <<-EOF static const char *hook_names[] = { $(sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*/",/;s/^/ "/;p}}' \ Documentation/githooks.txt) }; EOF Then you would use a simple `for()` loop using `ARRAY_SIZE()` to iterate over the hook names. > + if (find_hook(iter->string)) { > + strbuf_addstr(hook_info, iter->string); > + strbuf_complete_line(hook_info); > + } > + } > +} > diff --git a/bugreport.h b/bugreport.h > index 7413e7e1be..942a5436e3 100644 > --- a/bugreport.h > +++ b/bugreport.h > @@ -12,3 +12,9 @@ void get_system_info(struct strbuf *sys_info); > * config_info will be discarded. > */ > void get_whitelisted_config(struct strbuf *sys_info); > + > +/** > + * Adds the paths to all configured hooks (but not their contents). The previous > + * contents of hook_info will be discarded. > + */ > +void get_populated_hooks(struct strbuf *hook_info); > diff --git a/builtin/bugreport.c b/builtin/bugreport.c > index 70fe0d2b85..a0eefba498 100644 > --- a/builtin/bugreport.c > +++ b/builtin/bugreport.c > @@ -60,6 +60,10 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix) > get_whitelisted_config(&buffer); > strbuf_write(&buffer, report); > > + add_header(report, "Populated Hooks"); Wait! I should have stumbled over this in an earlier patch. The `add_header()` function should not take a `FILE *` parameter at all, but instead an `struct strbuf *` one! Ciao, Dscho > + get_populated_hooks(&buffer); > + strbuf_write(&buffer, report); > + > fclose(report); > > launch_editor(report_path.buf, NULL, NULL); > -- > 2.24.0.rc0.303.g954a862665-goog > >
On Mon, Oct 28, 2019 at 03:31:36PM +0100, Johannes Schindelin wrote: > Hi Emily, > > On Thu, 24 Oct 2019, Emily Shaffer wrote: > > > Occasionally a failure a user is seeing may be related to a specific > > hook which is being run, perhaps without the user realizing. While the > > contents of hooks can be sensitive - containing user data or process > > information specific to the user's organization - simply knowing that a > > hook is being run at a certain stage can help us to understand whether > > something is going wrong. > > > > Without a definitive list of hook names within the code, we compile our > > own list from the documentation. This is likely prone to bitrot. To > > reduce the amount of code humans need to read, we turn the list into a > > string_list and iterate over it (as we are calling the same find_hook > > operation on each string). However, since bugreport should primarily be > > called by the user, the performance loss from massaging the string > > seems acceptable. > > Good idea! > > > > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > > --- > > bugreport.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ > > bugreport.h | 6 ++++++ > > builtin/bugreport.c | 4 ++++ > > 3 files changed, 54 insertions(+) > > > > diff --git a/bugreport.c b/bugreport.c > > index afa4836ab1..9d7f44ff28 100644 > > --- a/bugreport.c > > +++ b/bugreport.c > > @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info) > > strbuf_reset(config_info); > > strbuf_addbuf(config_info, &configs_and_values); > > } > > + > > +void get_populated_hooks(struct strbuf *hook_info) > > +{ > > + /* > > + * Doesn't look like there is a list of all possible hooks; so below is > > + * a transcription of `git help hook`. > > + */ > > + const char *hooks = "applypatch-msg," > > + "pre-applypatch," > > + "post-applypatch," > > + "pre-commit," > > + "pre-merge-commit," > > + "prepare-commit-msg," > > + "commit-msg," > > + "post-commit," > > + "pre-rebase," > > + "post-checkout," > > + "post-merge," > > + "pre-push," > > + "pre-receive," > > + "update," > > + "post-receive," > > + "post-update," > > + "push-to-checkout," > > + "pre-auto-gc," > > + "post-rewrite," > > + "sendemail-validate," > > + "fsmonitor-watchman," > > + "p4-pre-submit," > > + "post-index-changex"; > > Well, this is disappointing: I tried to come up with a scripted way to > extract the commit names from our source code, and I failed! This is > where I gave up: > > git grep run_hook | > sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' | > sed -e '/^name$/d' -e '/^const char \*name$/d' \ > -e 's/push_to_checkout_hook/"push-to-checkout"/' | > sort | > uniq > > This prints only the following hook names: > > "applypatch-msg" > "post-applypatch" > "post-checkout" > "post-index-change" > "post-merge" > "pre-applypatch" > "pre-auto-gc" > "pre-rebase" > "prepare-commit-msg" > "push-to-checkout" > > But at least it was easy to script the extracting from the > Documentation: > > sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \ > Documentation/githooks.txt > To be totally frank, I'm not keen on spending a lot of time with scripting this. My parallel work with config-based hooks will also involve an in-code source of truth for available hooknames; I'd like to punt on some scripting, putting it in the makefile, etc blah since I know I'm planning to fix this particular annoyance at the source - and since it looks like bugreport will stay in the review phase for at least a little longer. > > + struct string_list hooks_list = STRING_LIST_INIT_DUP; > > + struct string_list_item *iter = NULL; > > + > > + strbuf_reset(hook_info); > > + > > + string_list_split(&hooks_list, hooks, ',', -1); > > + > > + for_each_string_list_item(iter, &hooks_list) { > > This should definitely be done at compile time, I think. We should be > able to generate an appropriate header via something like this: > > cat >hook-names.h <<-EOF > static const char *hook_names[] = { > $(sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*/",/;s/^/ "/;p}}' \ > Documentation/githooks.txt) > }; > EOF > > Then you would use a simple `for()` loop using `ARRAY_SIZE()` to > iterate over the hook names. > > > + if (find_hook(iter->string)) { > > + strbuf_addstr(hook_info, iter->string); > > + strbuf_complete_line(hook_info); > > + } > > + } > > +} > > diff --git a/bugreport.h b/bugreport.h > > index 7413e7e1be..942a5436e3 100644 > > --- a/bugreport.h > > +++ b/bugreport.h > > @@ -12,3 +12,9 @@ void get_system_info(struct strbuf *sys_info); > > * config_info will be discarded. > > */ > > void get_whitelisted_config(struct strbuf *sys_info); > > + > > +/** > > + * Adds the paths to all configured hooks (but not their contents). The previous > > + * contents of hook_info will be discarded. > > + */ > > +void get_populated_hooks(struct strbuf *hook_info); > > diff --git a/builtin/bugreport.c b/builtin/bugreport.c > > index 70fe0d2b85..a0eefba498 100644 > > --- a/builtin/bugreport.c > > +++ b/builtin/bugreport.c > > @@ -60,6 +60,10 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix) > > get_whitelisted_config(&buffer); > > strbuf_write(&buffer, report); > > > > + add_header(report, "Populated Hooks"); > > Wait! I should have stumbled over this in an earlier patch. The > `add_header()` function should not take a `FILE *` parameter at all, but > instead an `struct strbuf *` one! > > Ciao, > Dscho > > > + get_populated_hooks(&buffer); > > + strbuf_write(&buffer, report); > > + > > fclose(report); > > > > launch_editor(report_path.buf, NULL, NULL); > > -- > > 2.24.0.rc0.303.g954a862665-goog > > > >
Hi Emily, On Wed, 11 Dec 2019, Emily Shaffer wrote: > On Mon, Oct 28, 2019 at 03:31:36PM +0100, Johannes Schindelin wrote: > > > On Thu, 24 Oct 2019, Emily Shaffer wrote: > > > > > diff --git a/bugreport.c b/bugreport.c > > > index afa4836ab1..9d7f44ff28 100644 > > > --- a/bugreport.c > > > +++ b/bugreport.c > > > @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info) > > > strbuf_reset(config_info); > > > strbuf_addbuf(config_info, &configs_and_values); > > > } > > > + > > > +void get_populated_hooks(struct strbuf *hook_info) > > > +{ > > > + /* > > > + * Doesn't look like there is a list of all possible hooks; so below is > > > + * a transcription of `git help hook`. > > > + */ > > > + const char *hooks = "applypatch-msg," > > > + "pre-applypatch," > > > + "post-applypatch," > > > + "pre-commit," > > > + "pre-merge-commit," > > > + "prepare-commit-msg," > > > + "commit-msg," > > > + "post-commit," > > > + "pre-rebase," > > > + "post-checkout," > > > + "post-merge," > > > + "pre-push," > > > + "pre-receive," > > > + "update," > > > + "post-receive," > > > + "post-update," > > > + "push-to-checkout," > > > + "pre-auto-gc," > > > + "post-rewrite," > > > + "sendemail-validate," > > > + "fsmonitor-watchman," > > > + "p4-pre-submit," > > > + "post-index-changex"; > > > > Well, this is disappointing: I tried to come up with a scripted way to > > extract the commit names from our source code, and I failed! This is > > where I gave up: > > > > git grep run_hook | > > sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' | > > sed -e '/^name$/d' -e '/^const char \*name$/d' \ > > -e 's/push_to_checkout_hook/"push-to-checkout"/' | > > sort | > > uniq > > > > This prints only the following hook names: > > > > "applypatch-msg" > > "post-applypatch" > > "post-checkout" > > "post-index-change" > > "post-merge" > > "pre-applypatch" > > "pre-auto-gc" > > "pre-rebase" > > "prepare-commit-msg" > > "push-to-checkout" > > > > But at least it was easy to script the extracting from the > > Documentation: > > > > sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \ > > Documentation/githooks.txt > > > > To be totally frank, I'm not keen on spending a lot of time with > scripting this. My parallel work with config-based hooks will also > involve an in-code source of truth for available hooknames; I'd like to > punt on some scripting, putting it in the makefile, etc blah since I > know I'm planning to fix this particular annoyance at the source - and > since it looks like bugreport will stay in the review phase for at least > a little longer. Fair enough, especially if it addresses my complaint about the scriptability. Ciao, Dscho
diff --git a/bugreport.c b/bugreport.c index afa4836ab1..9d7f44ff28 100644 --- a/bugreport.c +++ b/bugreport.c @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info) strbuf_reset(config_info); strbuf_addbuf(config_info, &configs_and_values); } + +void get_populated_hooks(struct strbuf *hook_info) +{ + /* + * Doesn't look like there is a list of all possible hooks; so below is + * a transcription of `git help hook`. + */ + const char *hooks = "applypatch-msg," + "pre-applypatch," + "post-applypatch," + "pre-commit," + "pre-merge-commit," + "prepare-commit-msg," + "commit-msg," + "post-commit," + "pre-rebase," + "post-checkout," + "post-merge," + "pre-push," + "pre-receive," + "update," + "post-receive," + "post-update," + "push-to-checkout," + "pre-auto-gc," + "post-rewrite," + "sendemail-validate," + "fsmonitor-watchman," + "p4-pre-submit," + "post-index-changex"; + struct string_list hooks_list = STRING_LIST_INIT_DUP; + struct string_list_item *iter = NULL; + + strbuf_reset(hook_info); + + string_list_split(&hooks_list, hooks, ',', -1); + + for_each_string_list_item(iter, &hooks_list) { + if (find_hook(iter->string)) { + strbuf_addstr(hook_info, iter->string); + strbuf_complete_line(hook_info); + } + } +} diff --git a/bugreport.h b/bugreport.h index 7413e7e1be..942a5436e3 100644 --- a/bugreport.h +++ b/bugreport.h @@ -12,3 +12,9 @@ void get_system_info(struct strbuf *sys_info); * config_info will be discarded. */ void get_whitelisted_config(struct strbuf *sys_info); + +/** + * Adds the paths to all configured hooks (but not their contents). The previous + * contents of hook_info will be discarded. + */ +void get_populated_hooks(struct strbuf *hook_info); diff --git a/builtin/bugreport.c b/builtin/bugreport.c index 70fe0d2b85..a0eefba498 100644 --- a/builtin/bugreport.c +++ b/builtin/bugreport.c @@ -60,6 +60,10 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix) get_whitelisted_config(&buffer); strbuf_write(&buffer, report); + add_header(report, "Populated Hooks"); + get_populated_hooks(&buffer); + strbuf_write(&buffer, report); + fclose(report); launch_editor(report_path.buf, NULL, NULL);
Occasionally a failure a user is seeing may be related to a specific hook which is being run, perhaps without the user realizing. While the contents of hooks can be sensitive - containing user data or process information specific to the user's organization - simply knowing that a hook is being run at a certain stage can help us to understand whether something is going wrong. Without a definitive list of hook names within the code, we compile our own list from the documentation. This is likely prone to bitrot. To reduce the amount of code humans need to read, we turn the list into a string_list and iterate over it (as we are calling the same find_hook operation on each string). However, since bugreport should primarily be called by the user, the performance loss from massaging the string seems acceptable. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- bugreport.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ bugreport.h | 6 ++++++ builtin/bugreport.c | 4 ++++ 3 files changed, 54 insertions(+)