Message ID | 20190815023418.33407-1-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | bugreport: add tool to generate debugging info | expand |
On 8/14/2019 10:34 PM, Emily Shaffer wrote: > Make it easier for users who encounter a bug to send a report by > collecting some state information, intended to be viewed by humans > familiar with Git. This is an excellent idea! VFS for Git has a similar "diagnose" command that collects logs, config, and other details (like packfile sizes and loose object counts). That has been a critical tool for supporting users. > Teach Git how to prompt the user for a good bug report: reproduction > steps, expected behavior, and actual behavior. Also, teach Git to write > down its own version, the version of some of its dependencies, the > operating system information, the effective gitconfig, the configured > hooks, and the contents of $GIT_DIR/logs. Finally, make sure Git asks > the user to review the contents of the report after it's generated. > > If users can send us a well-written bug report complete with system > information, a remote we may be able to clone, and a reflog showing us > where the problem occurred, we can reduce the number of questions and > answers which must travel between the reporter and the Git contributor. > > Users may also wish to send a report like this to their local "Git > expert" if they have put their repository into a state they are confused > by. > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > There are some things I'm not certain about from this patch, I'd > appreciate feedback. > > - Do we want to advertise the Git mailing list for bug reports? That is possible. Isn't there another mailing list for git users? I could see a patch added on top of this for git-for-windows/git that changes the instructions to create issues on GitHub. > - Which config options should we filter to avoid accidentally receiving > credentials? The remote URLs are pretty sensitive. Not only do users sometimes put passwords or PATs into their URLs, the literal name of the repo could be a secret. > - At the moment, the test is trying to check "everything we thought we > would populate got populated" - that seemed to me like it would hold > up best to changes to the set of information being collected. But > maybe there's something more robust we can do. > > And of course, if there are suggestions for other things to include in > the report I'm interested in adding. > > An example of the output can be found here: > https://gist.github.com/nasamuffin/2e320892f5c2147e829cbcf5bd0759a2 > > Thanks. > - Emily > > .gitignore | 1 + > Documentation/git-bugreport.txt | 48 ++++++++++++++++++ > Makefile | 1 + > command-list.txt | 1 + > git-bugreport.sh | 86 +++++++++++++++++++++++++++++++++ > t/t0091-bugreport.sh | 41 ++++++++++++++++ > 6 files changed, 178 insertions(+) > create mode 100644 Documentation/git-bugreport.txt > create mode 100755 git-bugreport.sh > create mode 100755 t/t0091-bugreport.sh > > diff --git a/.gitignore b/.gitignore > index 521d8f4fb4..b4f5433084 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -25,6 +25,7 @@ > /git-bisect--helper > /git-blame > /git-branch > +/git-bugreport > /git-bundle > /git-cat-file > /git-check-attr > diff --git a/Documentation/git-bugreport.txt b/Documentation/git-bugreport.txt > new file mode 100644 > index 0000000000..c5f45bbee8 > --- /dev/null > +++ b/Documentation/git-bugreport.txt > @@ -0,0 +1,48 @@ > +git-bugreport(1) > +================ > + > +NAME > +---- > +git-bugreport - Collect information for user to file a bug report > + > +SYNOPSIS > +-------- > +[verse] > +'git bugreport' [-o | --output <path>] > + > +DESCRIPTION > +----------- > +Captures information about the user's machine, Git client, and repository state, > +as well as a form requesting information about the behavior the user observed, > +into a single text file which the user can then share, for example to the Git > +mailing list, in order to report an observed bug. > + > +The following information is requested from the user: > + > + - Reproduction steps > + - Expected behavior > + - Actual behavior > + > +The following information is captured automatically: > + > + - Git version (`git version --build-options`) > + - Machine information (`uname -a`) > + - Versions of various dependencies > + - Git config contents (`git config --show-origin --list`) > + - The contents of all configured git-hooks in `.git/hooks/` > + - The contents of `.git/logs` > + > +OPTIONS > +------- > +-o [<path>]:: > +--output [<path>]:: > + Place the resulting bug report file in <path> instead of the root of the > + Git repository. > + > +NOTE > +---- > +Bug reports can be sent to git@vger.kernel.org. > + > +GIT > +--- > +Part of the linkgit:git[1] suite > diff --git a/Makefile b/Makefile > index f9255344ae..69801a1c45 100644 > --- a/Makefile > +++ b/Makefile > @@ -606,6 +606,7 @@ TEST_PROGRAMS_NEED_X = > unexport CDPATH > > SCRIPT_SH += git-bisect.sh > +SCRIPT_SH += git-bugreport.sh > SCRIPT_SH += git-difftool--helper.sh > SCRIPT_SH += git-filter-branch.sh > SCRIPT_SH += git-merge-octopus.sh > diff --git a/command-list.txt b/command-list.txt > index a9ac72bef4..be5a605047 100644 > --- a/command-list.txt > +++ b/command-list.txt > @@ -54,6 +54,7 @@ git-archive mainporcelain > git-bisect mainporcelain info > git-blame ancillaryinterrogators complete > git-branch mainporcelain history > +git-bugreport ancillaryinterrogators > git-bundle mainporcelain > git-cat-file plumbinginterrogators > git-check-attr purehelpers > diff --git a/git-bugreport.sh b/git-bugreport.sh > new file mode 100755 > index 0000000000..2200703a51 > --- /dev/null > +++ b/git-bugreport.sh At first I was alarmed by "What? another shell script?" but this command should prioritize flexibility and extensibility over speed. Running multiple processes shouldn't be too taxing for what we are trying to do here. > @@ -0,0 +1,86 @@ > +#!/bin/sh > + > +print_filenames_and_content() { > +while read -r line; do > + echo "$line"; > + echo "========"; > + cat "$line"; > + echo; > +done > +} > + > +generate_report_text() { > + > + # Generate a form for the user to fill out with their issue. > + gettextln "Thank you for filling out a Git bug report!" > + gettextln "Please answer the following questions to help us understand your issue." > + echo > + gettextln "What did you do before the bug happened? (Steps to reproduce your issue)" > + echo > + gettextln "What did you expect to happen? (Expected behavior)" > + echo > + gettextln "What happened instead? (Actual behavior)" > + echo > + gettextln "Anything else you want to add:" > + echo > + gettextln "Please review the rest of the bug report below." > + gettextln "You can delete any lines you don't wish to send." > + echo > + > + echo "[System Information]" > + git version --build-options > + uname -a > + curl-config --version > + ldd --version > + echo > + > + echo "[Git Config]" > + # TODO: Pass this through grep -v to avoid users sending us their credentials. > + git config --show-origin --list > + echo Config options to consider stripping out: *url* *pass* (anything "password" but also "sendmail.smtppass") > + echo "[Configured Hooks]" > + find "$GIT_DIR/hooks/" -type f | grep -v "\.sample$" | print_filenames_and_content > + echo Remove the sample hooks, but focus on the others. Will this look like garbage if a hook is a binary file? > + > + echo "[Git Logs]" > + find "$GIT_DIR/logs" -type f | print_filenames_and_content > + echo As mentioned before, I've sometimes found it helpful to know the data shape for the object store. Having a few extra steps such as the following could be nice: echo "[Loose Objects]" for objdir in $(find "$GIT_DIR/objects/??" -type d) do echo "$objdir: $(ls $objdir | wc -l)" done echo echo "[Pack Data]" ls -l "$GIT_DIR/objects/pack" echo echo "[Object Info Data]" ls -lR "$GIT_DIR/objects/info" echo echo "[Alternates File]" echo "========" cat "$GIT_DIR/objects/info/alternates" echo That last one will collect information on the commit-graph file, even if it is an incremental file. > + > +} > + > +USAGE="[-o | --output <path>]" > + > +SUBDIRECTORY_OK=t > +OPTIONS_SPEC= > +. git-sh-setup > +. git-sh-i18n > + > +basedir="$PWD" > +while : > +do > + case "$1" in > + -o|--output) > + shift > + basedir="$1" > + shift > + continue > + ;; > + "") > + break > + ;; > + *) > + usage > + ;; > + esac > +done > + > + > +# Create bugreport file > +BUGREPORT_FILE="$basedir/git-bugreport-$(whoami)-$(hostname)-$(date -Iminutes)" > + > +generate_report_text >$BUGREPORT_FILE > + > +git_editor $BUGREPORT_FILE > + > +eval_gettextln "Your new bug report is in \$BUGREPORT_FILE." > diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh > new file mode 100755 > index 0000000000..6eb2ee4f66 > --- /dev/null > +++ b/t/t0091-bugreport.sh > @@ -0,0 +1,41 @@ > +#!/bin/bash > + > +test_description='git bugreport' > + > +. ./test-lib.sh > + > +# Headers "[System Info]" will be followed by a non-empty line if we put some > +# information there; we can make sure all our headers were followed by some > +# information to check if the command was successful. > +HEADER_PATTERN="^\[.*\]$" > +check_all_headers_populated() { > + while read -r line; do > + if [$(grep $HEADER_PATTERN $line)]; then > + read -r nextline > + if [-z $nextline]; then > + return 1; > + fi > + fi > + done > +} > + > +test_expect_success 'creates a report with content in the right places' ' > + git bugreport && > + check_all_headers_populated <git-bugreport-* && > + rm git-bugreport-* > +' > + > +test_expect_success '--output puts the report in the provided dir' ' > + mkdir foo/ && > + git bugreport -o foo/ && > + test -f foo/git-bugreport-* && > + rm -fr foo/ > +' > + > +test_expect_success 'incorrect arguments abort with usage' ' > + test_must_fail git bugreport --false 2>output && > + grep usage output && > + test ! -f git-bugreport-* > +' > + > +test_done > I think this is a great start, and I'll take some time later to try it out. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: > Config options to consider stripping out: > > *url* > *pass* (anything "password" but also "sendmail.smtppass") Blacklisting? I wonder if users feel safer if these are limited to known-benign ones. >> + echo "[Configured Hooks]" >> + find "$GIT_DIR/hooks/" -type f | grep -v "\.sample$" | print_filenames_and_content >> + echo > > Remove the sample hooks, but focus on the others. Will this look like garbage if a hook > is a binary file? This makes me feel very nervous. $GIT_DIR/hooks/ are private and people can hardcode credentials in them; $GIT_DIR/hooks/pre-foo may be written toread from $GIT_DIR/hooks/mypassword with the knowledge that there won't be any "mypassword" hook.
Emily Shaffer <emilyshaffer@google.com> writes: > diff --git a/Documentation/git-bugreport.txt b/Documentation/git-bugreport.txt > new file mode 100644 > index 0000000000..c5f45bbee8 > --- /dev/null > +++ b/Documentation/git-bugreport.txt > @@ -0,0 +1,48 @@ > +git-bugreport(1) > +================ > + > +NAME > +---- > +git-bugreport - Collect information for user to file a bug report > + > +SYNOPSIS > +-------- > +[verse] > +'git bugreport' [-o | --output <path>] > + > +DESCRIPTION > +----------- > +Captures information about the user's machine, Git client, and repository state, > +as well as a form requesting information about the behavior the user observed, > +into a single text file which the user can then share, for example to the Git > +mailing list, in order to report an observed bug. > + > +The following information is requested from the user: > + > + - Reproduction steps > + - Expected behavior > + - Actual behavior + - How the above two are different It is often helpful to have users explain how the expected and actual are different in their own words. > +NOTE > +---- > +Bug reports can be sent to git@vger.kernel.org. I am not sure if this belongs here. > diff --git a/git-bugreport.sh b/git-bugreport.sh > new file mode 100755 > index 0000000000..2200703a51 > --- /dev/null > +++ b/git-bugreport.sh > @@ -0,0 +1,86 @@ > +#!/bin/sh > + > +print_filenames_and_content() { > +while read -r line; do > + echo "$line"; > + echo "========"; > + cat "$line"; > + echo; > +done > +} Style. - have SP on both sides of () - one more HT indent for the function body - "do" on its own line - no unnecessary semicolons when LF would do You probably are better off asking xargs to do this instead of relying on "read -r". find ... | xargs -n 1 sh -c 'echo "$1" && cat "$1"' - or something like that, perhaps. > + > +generate_report_text() { > + > + # Generate a form for the user to fill out with their issue. > + gettextln "Thank you for filling out a Git bug report!" > + gettextln "Please answer the following questions to help us understand your issue." > + echo > + gettextln "What did you do before the bug happened? (Steps to reproduce your issue)" > + echo > + gettextln "What did you expect to happen? (Expected behavior)" > + echo > + gettextln "What happened instead? (Actual behavior)" > + echo > + gettextln "Anything else you want to add:" > + echo > + gettextln "Please review the rest of the bug report below." > + gettextln "You can delete any lines you don't wish to send." > + echo Would we on the receiving end be able to tell these section headers in translated to 47 different languages? I am sure that i18n is used here to encourage non-C-locale users to file bugs in their own languages, but are we prepared to react to them? > + echo "[System Information]" > + git version --build-options > + uname -a > + curl-config --version > + ldd --version "curl-config: command not found" may be clear enough, but would there be a case where errors from two or more consecutive commands in the above sequence would make the output confusing to the person sitting on the receiving end? Would it help, as a convention, to always ahve "echo [what am I reporting]" before each of these commands? > +# Create bugreport file > +BUGREPORT_FILE="$basedir/git-bugreport-$(whoami)-$(hostname)-$(date -Iminutes)" How portable is -Iminutes? > diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh > new file mode 100755 > index 0000000000..6eb2ee4f66 > --- /dev/null > +++ b/t/t0091-bugreport.sh > @@ -0,0 +1,41 @@ > +#!/bin/bash Make sure /bin/sh suffices to run the test script. > +test_description='git bugreport' > + > +. ./test-lib.sh > + > +# Headers "[System Info]" will be followed by a non-empty line if we put some > +# information there; we can make sure all our headers were followed by some > +# information to check if the command was successful. > +HEADER_PATTERN="^\[.*\]$" > +check_all_headers_populated() { > + while read -r line; do > + if [$(grep $HEADER_PATTERN $line)]; then Documentation/CodingGuidelines I am not sure if the traits this test script checks about the contents of the output is all that interesting. Whenever we add new sections to the main command because we want other kinds of information collected, we'd update this test script because otherwise the test would fail, but would it result in quality bugreport tool, or is it just additional busywork? If we decide later that a header and its body needs to be separated with a blank line for better readablity, the check done here would also need to be updated, but again, that does not feel like anything more than just busywork to me. The tests for "-o" and unknown options do make sense, though. Thanks. > +# Headers "[System Info]" will be followed by a non-empty line if we put some > +# information there; we can make sure all our headers were followed by some > +# information to check if the command was successful. > +HEADER_PATTERN="^\[.*\]$" > +check_all_headers_populated() { > + while read -r line; do > + if [$(grep $HEADER_PATTERN $line)]; then > + read -r nextline > + if [-z $nextline]; then > + return 1; > + fi > + fi > + done > +} > + > +test_expect_success 'creates a report with content in the right places' ' > + git bugreport && > + check_all_headers_populated <git-bugreport-* && > + rm git-bugreport-* > +' > + > +test_expect_success '--output puts the report in the provided dir' ' > + mkdir foo/ && > + git bugreport -o foo/ && > + test -f foo/git-bugreport-* && > + rm -fr foo/ > +' > + > +test_expect_success 'incorrect arguments abort with usage' ' > + test_must_fail git bugreport --false 2>output && > + grep usage output && > + test ! -f git-bugreport-* > +' > + > +test_done
Hi, On Thu, 15 Aug 2019, Derrick Stolee wrote: > On 8/14/2019 10:34 PM, Emily Shaffer wrote: > > diff --git a/git-bugreport.sh b/git-bugreport.sh > > new file mode 100755 > > index 0000000000..2200703a51 > > --- /dev/null > > +++ b/git-bugreport.sh > > At first I was alarmed by "What? another shell script?" but this > command should prioritize flexibility and extensibility over speed. > Running multiple processes shouldn't be too taxing for what we are > trying to do here. Git for Windows sometimes receives bug reports about Bash not being able to start (usually it is a DLL base address problem, related to the way Cygwin and MSYS2 emulate `fork()`). In such a case, `git bugreport` would only be able to offer a reason for yet another bug report instead of adding useful metadata. Something to keep in mind when deciding how to implement this command. Ciao, Dscho
On Thu, Aug 15, 2019 at 10:15:24AM -0400, Derrick Stolee wrote: > > - Do we want to advertise the Git mailing list for bug reports? > > That is possible. Isn't there another mailing list for git users? I know there's an IRC channel for Git users, I dunno about mailing list. I'm worried that places I name as points of contact will grow stale, although I suppose #git on freenode isn't really going anywhere, for example. (But as a counterexample, I hope that nobody sends something like this to #git-devel, which seems to have a lower population than this mailing list.) > > I could see a patch added on top of this for git-for-windows/git that > changes the instructions to create issues on GitHub. Indeed; I imagine we'd probably like to patch it to ask for bugs in our bug tracker. > > > - Which config options should we filter to avoid accidentally receiving > > credentials? > > The remote URLs are pretty sensitive. Not only do users sometimes put passwords > or PATs into their URLs, the literal name of the repo could be a secret. Now here's where I start to wonder. We often debug internally by asking for the remote URL and replicating the issue there, which is why I mentioned it explicitly in the commit message. But I hadn't considered folks including the password in the URL. Well, I suppose anybody can keep a local patch to change the config filter pattern. I'll try to make it easy to spot and modify. [snip] > At first I was alarmed by "What? another shell script?" but this command should > prioritize flexibility and extensibility over speed. Running multiple processes > shouldn't be too taxing for what we are trying to do here. If shell scripts are entirely deprecated I can convert it, but doing it in C seemed like overkill when I really just wanted "what are all the commands we would ask the user to run and tell us the output?". I figured also that it would be a little more immune to bitrot to output the contents of porcelain commands here. > > + echo "[Git Config]" > > + # TODO: Pass this through grep -v to avoid users sending us their credentials. > > + git config --show-origin --list > > + echo > > Config options to consider stripping out: > > *url* > *pass* (anything "password" but also "sendmail.smtppass") Done, thanks. > > > + echo "[Configured Hooks]" > > + find "$GIT_DIR/hooks/" -type f | grep -v "\.sample$" | print_filenames_and_content > > + echo > > Remove the sample hooks, but focus on the others. Will this look like garbage if a hook > is a binary file? Yeah, I'm sure it will. I'll add a check to print_filenames_and_content() so it can tell us if there is a hook installed there, even if we can't see the content. > > > + > > + echo "[Git Logs]" > > + find "$GIT_DIR/logs" -type f | print_filenames_and_content > > + echo > > As mentioned before, I've sometimes found it helpful to know the data shape for the object > store. Having a few extra steps such as the following could be nice: > > echo "[Loose Objects]" > for objdir in $(find "$GIT_DIR/objects/??" -type d) > do > echo "$objdir: $(ls $objdir | wc -l)" `echo "$objdir: $(ls $objdir | wc -l) objects` I'll add context so we don't need to have the bugreport script memorized in order to read a bugreport :) > done > echo > > echo "[Pack Data]" > ls -l "$GIT_DIR/objects/pack" > echo > > echo "[Object Info Data]" > ls -lR "$GIT_DIR/objects/info" > echo > > echo "[Alternates File]" > echo "========" > cat "$GIT_DIR/objects/info/alternates" > echo > > That last one will collect information on the commit-graph file, even if it is an > incremental file. Thanks Stolee, these are awesome and exactly the kind of feedback I was hoping for. > > I think this is a great start, and I'll take some time later to try it out. > > Thanks, > -Stolee Awesome. I'm excited to hear how it goes. - Emily
On Thu, Aug 15, 2019 at 11:10:12AM -0700, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > > > diff --git a/Documentation/git-bugreport.txt b/Documentation/git-bugreport.txt > > new file mode 100644 > > index 0000000000..c5f45bbee8 > > --- /dev/null > > +++ b/Documentation/git-bugreport.txt > > @@ -0,0 +1,48 @@ > > +git-bugreport(1) > > +================ > > + > > +NAME > > +---- > > +git-bugreport - Collect information for user to file a bug report > > + > > +SYNOPSIS > > +-------- > > +[verse] > > +'git bugreport' [-o | --output <path>] > > + > > +DESCRIPTION > > +----------- > > +Captures information about the user's machine, Git client, and repository state, > > +as well as a form requesting information about the behavior the user observed, > > +into a single text file which the user can then share, for example to the Git > > +mailing list, in order to report an observed bug. > > + > > +The following information is requested from the user: > > + > > + - Reproduction steps > > + - Expected behavior > > + - Actual behavior > > + - How the above two are different > > It is often helpful to have users explain how the expected and > actual are different in their own words. Good point; added. > > > +NOTE > > +---- > > +Bug reports can be sent to git@vger.kernel.org. > > I am not sure if this belongs here. Sure, I wasn't certain either. Would you rather I remove the "what to do with this bugreport" NOTE section entirely? I worry that folks will think this generated text file is being magically reported behind the scenes, or that they won't know what to do with the file now that it's created. I suppose I did mention in the DESCRIPTION section that the user can share the generated text file to report a bug. Maybe I should remove "for example to the Git ML" from there as well. > > > diff --git a/git-bugreport.sh b/git-bugreport.sh > > new file mode 100755 > > index 0000000000..2200703a51 > > --- /dev/null > > +++ b/git-bugreport.sh > > @@ -0,0 +1,86 @@ > > +#!/bin/sh > > + > > +print_filenames_and_content() { > > +while read -r line; do > > + echo "$line"; > > + echo "========"; > > + cat "$line"; > > + echo; > > +done > > +} > > Style. > > - have SP on both sides of () > - one more HT indent for the function body > - "do" on its own line > - no unnecessary semicolons when LF would do > > You probably are better off asking xargs to do this instead of > relying on "read -r". > > find ... | xargs -n 1 sh -c 'echo "$1" && cat "$1"' - > > or something like that, perhaps. Hm. In responding to Stolee's comments, I ended up replacing "cat $line" with another function, and it seems xargs isn't happy directing to functions which haven't been exported, because it spins a new process for each argument it's passed. I had originally intended to do this with xargs but then didn't want to pack many lines into one xargs call. The way the code is in PS1 I think it's a reasonable replacement, but the code in PS2 is less so - there's a check for whether a file given is binary or whether it exists at all (the former for hooks, which I think you were worried about, and the latter for the alternates file, which I found I do not have.) Why is "read -r" unreliable? Is it better to export a function and use xargs, compared to using "read -r" in a loop like this? I was worried that exporting a function is similar to namespace pollution in C++, but maybe I'm mistaken. > > > + > > +generate_report_text() { > > + > > + # Generate a form for the user to fill out with their issue. > > + gettextln "Thank you for filling out a Git bug report!" > > + gettextln "Please answer the following questions to help us understand your issue." > > + echo > > + gettextln "What did you do before the bug happened? (Steps to reproduce your issue)" > > + echo > > + gettextln "What did you expect to happen? (Expected behavior)" > > + echo > > + gettextln "What happened instead? (Actual behavior)" > > + echo > > + gettextln "Anything else you want to add:" > > + echo > > + gettextln "Please review the rest of the bug report below." > > + gettextln "You can delete any lines you don't wish to send." > > + echo > > Would we on the receiving end be able to tell these section headers > in translated to 47 different languages? I am sure that i18n is > used here to encourage non-C-locale users to file bugs in their own > languages, but are we prepared to react to them? I hope that we could accept reports in many languages with online translation as a guide, but practically speaking, machine-learning-powered online translation may not be as good for technical topics as it is for conversational. At the same time, it seems unnecessarily restrictive to ask for bugs to come in English. My heart thinks it's better to encourage other languages too. > > > + echo "[System Information]" > > + git version --build-options > > + uname -a > > + curl-config --version > > + ldd --version > > "curl-config: command not found" may be clear enough, but would > there be a case where errors from two or more consecutive commands > in the above sequence would make the output confusing to the person > sitting on the receiving end? Would it help, as a convention, to > always ahve "echo [what am I reporting]" before each of these > commands? Yeah, I think that's a good point. In responding to Stolee's review, I mentioned adding additional context to some command's output so that "those reviewing the report don't need to be intimately familiar with what the bugreport script runs". But it might actually be most useful to echo the command that's being run everywhere. I imagine it's fine to include the [] header as well, to make it easier for the user to understand what was generated. > > > +# Create bugreport file > > +BUGREPORT_FILE="$basedir/git-bugreport-$(whoami)-$(hostname)-$(date -Iminutes)" > > How portable is -Iminutes? Ah, it seems while it works in MingW and bash, it doesn't work in zsh. I'll figure something else out. Thanks. > > > diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh > > new file mode 100755 > > index 0000000000..6eb2ee4f66 > > --- /dev/null > > +++ b/t/t0091-bugreport.sh > > @@ -0,0 +1,41 @@ > > +#!/bin/bash > > Make sure /bin/sh suffices to run the test script. Will do. Thanks. > > > +test_description='git bugreport' > > + > > +. ./test-lib.sh > > + > > +# Headers "[System Info]" will be followed by a non-empty line if we put some > > +# information there; we can make sure all our headers were followed by some > > +# information to check if the command was successful. > > +HEADER_PATTERN="^\[.*\]$" > > +check_all_headers_populated() { > > + while read -r line; do > > + if [$(grep $HEADER_PATTERN $line)]; then > > Documentation/CodingGuidelines > > I am not sure if the traits this test script checks about the > contents of the output is all that interesting. Whenever we add new > sections to the main command because we want other kinds of > information collected, we'd update this test script because > otherwise the test would fail, but would it result in quality > bugreport tool, or is it just additional busywork? I'm a little confused about this - I checked for a general [...] header standing alone on a line specifically so that we wouldn't need to update the test script when adding a new section. What do you see that will grow stale? > > If we decide later that a header and its body needs to be separated > with a blank line for better readablity, the check done here would > also need to be updated, but again, that does not feel like anything > more than just busywork to me. Sure, I agree with that. So, what's your suggestion? Not to check the output at all? (This may actually be fine; it occurred to me while reading your review that if a user is filing a bug report about something, one of the diagnostic commands in bugreport might be what's broken for them. So perhaps it should be tolerant to missing information...) > > The tests for "-o" and unknown options do make sense, though. > > Thanks. Thanks for reviewing. - Emily
On Thu, Aug 15, 2019 at 10:07:57PM +0200, Johannes Schindelin wrote: > Hi, > > On Thu, 15 Aug 2019, Derrick Stolee wrote: > > > On 8/14/2019 10:34 PM, Emily Shaffer wrote: > > > diff --git a/git-bugreport.sh b/git-bugreport.sh > > > new file mode 100755 > > > index 0000000000..2200703a51 > > > --- /dev/null > > > +++ b/git-bugreport.sh > > > > At first I was alarmed by "What? another shell script?" but this > > command should prioritize flexibility and extensibility over speed. > > Running multiple processes shouldn't be too taxing for what we are > > trying to do here. > > Git for Windows sometimes receives bug reports about Bash not being able > to start (usually it is a DLL base address problem, related to the way > Cygwin and MSYS2 emulate `fork()`). Hmm. In a case like this, then, how is someone using GfW at all? Embarrassingly, I don't actually have a way to try it out for myself. It seems there's no GUI, and users are using it through the command line in mingw, so when you say "bash doesn't start" do you mean "users can't use the mingw prompt at all"? > > In such a case, `git bugreport` would only be able to offer a reason for > yet another bug report instead of adding useful metadata. > > Something to keep in mind when deciding how to implement this command. > > Ciao, > Dscho Yeah, that's an interesting point, thanks for bringing it up. So, in the case when bash won't start at all, what does that failure look like? How much of Git can users still use? For example, would they be able to get far enough to run a binary git-bugreport? - Emily
Emily Shaffer <emilyshaffer@google.com> writes: >> > +NOTE >> > +---- >> > +Bug reports can be sent to git@vger.kernel.org. >> >> I am not sure if this belongs here. > > Sure, I wasn't certain either. Would you rather I remove the "what to do > with this bugreport" NOTE section entirely? Not really. You are invoking an editor to let the user edit the pre-populated report, and I would imagine that a comment in that file would be the best place to give instructions, not a manpage for "git bugreport" command. > So, what's your suggestion? Not to check the output at all? (This may > actually be fine; it occurred to me while reading your review that if a > user is filing a bug report about something, one of the diagnostic > commands in bugreport might be what's broken for them. So perhaps it > should be tolerant to missing information...) Right.
On Thu, Aug 15, 2019 at 07:36:57AM -0700, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > > > Config options to consider stripping out: > > > > *url* > > *pass* (anything "password" but also "sendmail.smtppass") > > Blacklisting? I wonder if users feel safer if these are limited to > known-benign ones. I think a whitelist of config options to print would grow stale immediately, and the options we're missing would be very likely to be configs to turn on new experimental features - which is probably what we most want the bugreport for. > > >> + echo "[Configured Hooks]" > >> + find "$GIT_DIR/hooks/" -type f | grep -v "\.sample$" | print_filenames_and_content > >> + echo > > > > Remove the sample hooks, but focus on the others. Will this look like garbage if a hook > > is a binary file? > > This makes me feel very nervous. $GIT_DIR/hooks/ are private and > people can hardcode credentials in them; $GIT_DIR/hooks/pre-foo may > be written toread from $GIT_DIR/hooks/mypassword with the knowledge > that there won't be any "mypassword" hook. Hmm. I think the list of valid hooks isn't one that changes often, but it's also not enumerated in some machine-parseable way - it exists in Documentation/githooks.txt but that's all. I'd still be a little worried about bitrot... I think it's probably better to list the filenames in $GIT_DIR/hooks but not print their contents. I'll modify it. - Emily
On Thu, Aug 15, 2019 at 03:29:24PM -0700, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > > >> > +NOTE > >> > +---- > >> > +Bug reports can be sent to git@vger.kernel.org. > >> > >> I am not sure if this belongs here. > > > > Sure, I wasn't certain either. Would you rather I remove the "what to do > > with this bugreport" NOTE section entirely? > > Not really. You are invoking an editor to let the user edit the > pre-populated report, and I would imagine that a comment in that > file would be the best place to give instructions, not a manpage > for "git bugreport" command. Oh, I see! In that case, do you still want the Git mailing list address shown in the report text? > > > So, what's your suggestion? Not to check the output at all? (This may > > actually be fine; it occurred to me while reading your review that if a > > user is filing a bug report about something, one of the diagnostic > > commands in bugreport might be what's broken for them. So perhaps it > > should be tolerant to missing information...) > > Right. Ok, will do.
Emily Shaffer <emilyshaffer@google.com> writes: > I think a whitelist of config options to print would grow stale > immediately, and the options we're missing would be very likely to be > configs to turn on new experimental features - which is probably what we > most want the bugreport for. The implementation of your "git bugreport" command comes from the same version of Git as such an experimental feature you want feedback to is shipped with, so I am not sure where your concern about staleness comes from. If you really care about getting quality reports, you need to earn users' trust (one way of doing so would be to maintain a good "list of benign configuration knobs whose values help diagnosing issues"), and you need to make sure we obtain relevant pieces of information. Both of these things are something you need to actively work on. We have trained ourselves to the point that we consider it a bug if you add a new command git-bugreport without adding a corresponding pattern in the .gitignore file. And thanks to that discipline, we have been fairly good at keeping .gitignore up to date. I do not see a reason why we cannot do something similar to the registry of configuration variables that we care about, which may be shipped as part of "git bugreport" command.
On Thu, Aug 15, 2019 at 04:40:50PM -0700, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > > > I think a whitelist of config options to print would grow stale > > immediately, and the options we're missing would be very likely to be > > configs to turn on new experimental features - which is probably what we > > most want the bugreport for. > > The implementation of your "git bugreport" command comes from the > same version of Git as such an experimental feature you want > feedback to is shipped with, so I am not sure where your concern > about staleness comes from. > > If you really care about getting quality reports, you need to earn > users' trust (one way of doing so would be to maintain a good "list > of benign configuration knobs whose values help diagnosing issues"), > and you need to make sure we obtain relevant pieces of information. > Both of these things are something you need to actively work on. > > We have trained ourselves to the point that we consider it a bug if > you add a new command git-bugreport without adding a corresponding > pattern in the .gitignore file. And thanks to that discipline, we > have been fairly good at keeping .gitignore up to date. I do not > see a reason why we cannot do something similar to the registry of > configuration variables that we care about, which may be shipped as > part of "git bugreport" command. I think comparing this habit to the .gitignore isn't quite fair - .gitignore tells me I forgot to add my new command binary to it, when I run `git status` to see what I need to add to my commit with new command. But, I agree that it's not going to be possible to create a completely leakproof blacklisting heuristic here. So I'll use a whitelist for the next patchset. - Emily
Emily Shaffer <emilyshaffer@google.com> writes: > I think comparing this habit to the .gitignore isn't quite fair - > .gitignore tells me I forgot to add my new command binary to it, when I > run `git status` to see what I need to add to my commit with new > command. That is why I said that we need to actively work on, if we care about getting quality reports. I do not think it is unreasonable to expect the build procedure for "git bugreport" to involve scanning in Documentation/config/ to pick up variable names, annotated in such a way that is invisible to AsciiDoc to allow us tell which ones are sensitive and which ones are not. A test in t/ could even check if a documented configuration variable has such an annotation. A commit that adds configuration variables without documentiong them does exist, but variables without documentation are (1) bugs, and (2) are not worth serious engineering effort on until they get documented. I am not saying that you must implement the whitelist exactly like the above. I am not even saying that this must be whitelist and not blacklist---you'd have the same issue maintaining the list anyway. The above is merely to illustrate the reason why I think that the kind of "active work" to make sure that the list will not go stale would be feasible. Thanks.
On Fri, Aug 16, 2019 at 09:41:41AM -0700, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > > > I think comparing this habit to the .gitignore isn't quite fair - > > .gitignore tells me I forgot to add my new command binary to it, when I > > run `git status` to see what I need to add to my commit with new > > command. > > That is why I said that we need to actively work on, if we care > about getting quality reports. > > I do not think it is unreasonable to expect the build procedure for > "git bugreport" to involve scanning in Documentation/config/ to pick > up variable names, annotated in such a way that is invisible to > AsciiDoc to allow us tell which ones are sensitive and which ones > are not. A test in t/ could even check if a documented > configuration variable has such an annotation. A commit that adds > configuration variables without documentiong them does exist, but > variables without documentation are (1) bugs, and (2) are not worth > serious engineering effort on until they get documented. Interesting. I think I have an idea for a way to do this, but it ends up fairly large; I'll send proof of concept as a follow-on patch to this one. Thanks for the suggestions. - Emily
Hi Emily, On Thu, 15 Aug 2019, Emily Shaffer wrote: > On Thu, Aug 15, 2019 at 10:07:57PM +0200, Johannes Schindelin wrote: > > > > On Thu, 15 Aug 2019, Derrick Stolee wrote: > > > > > On 8/14/2019 10:34 PM, Emily Shaffer wrote: > > > > diff --git a/git-bugreport.sh b/git-bugreport.sh > > > > new file mode 100755 > > > > index 0000000000..2200703a51 > > > > --- /dev/null > > > > +++ b/git-bugreport.sh > > > > > > At first I was alarmed by "What? another shell script?" but this > > > command should prioritize flexibility and extensibility over speed. > > > Running multiple processes shouldn't be too taxing for what we are > > > trying to do here. > > > > Git for Windows sometimes receives bug reports about Bash not being able > > to start (usually it is a DLL base address problem, related to the way > > Cygwin and MSYS2 emulate `fork()`). > > Hmm. In a case like this, then, how is someone using GfW at all? On Windows, there are native command-line interpreters available that are _not_ Bash: PowerShell and CMD. These days, you can get away with using Git _without_ a working Unix shell most of the time. There are only preciously few commands left that are still written as scripts, and most of these seem to be used less often than one might think: - submodule - bisect - filter-branch - instaweb - mergetool - some uncommon merge strategies - rebase -p (already deprecated) - several CVS/Subversion/Arch/Quilt integrations - send-email (only relevant for mailing list centric projects, again, not very common any longer) - request-pull (I would not be surprised if this is specific for the Linux project, as if that project was even close to the main user of Git) So as long as you don't use submodules, and as long as you are unaware of the `bisect` command, which would comprise the majority of Git users in my experience, you can totally use Git for Windows without using Bash or Perl, i.e. without using the Cygwin/MSYS2 runtime that seems to be one of the common causes for trouble in Git for Windows. > Embarrassingly, I don't actually have a way to try it out for myself. > It seems there's no GUI, and users are using it through the command line > in mingw, so when you say "bash doesn't start" do you mean "users can't > use the mingw prompt at all"? There is no MinGW prompt. MinGW stands for "Minimal GNU on Windows", but it really refers to a way to compile native Win32 programs via GCC. Meaning that some of the POSIX functionality Unix/Linux developers have come to expect is unavailable. For example, `fork()`. Which means that there is no native port of Bash to Windows, at least not that _I_ am aware of. Git for Windows comes with a "Git Bash", which is essentially a Bash built against the MSYS2 runtime (i.e. it is much more like a Cygwin executable than like a Win32 executable). The common way to run Git for Windows is actually via `git.exe`, independent of what particular command-line interpreter you prefer. > > In such a case, `git bugreport` would only be able to offer a reason for > > yet another bug report instead of adding useful metadata. > > > > Something to keep in mind when deciding how to implement this command. > > Yeah, that's an interesting point, thanks for bringing it up. So, in the > case when bash won't start at all, what does that failure look like? How > much of Git can users still use? For example, would they be able to get > far enough to run a binary git-bugreport? See above. If I were you, I would not write `git bugreport` as anything but a built-in, written in C. Unless there are _really_ good reasons not to [*1*]. And quite honestly, I don't see any such reasons. Ciao, Dscho Footnote *1*: One good reason to use the MSYS2 Bash would be to be able to use MSYS2's POSIX emulation layer, e.g. to talk to the SSH agent (that is also an MSYS2 program, meaning that it communicates via emulated Unix sockets, which is a facility not available to native Win32 programs). Another valid reason to use the MSYS2 Bash is to be able to talk to the (emulated) pseudo TTYs provided by the MSYS2 runtime, e.g. when running inside a MinTTY window, which is the default for Git Bash in Git for Windows.
diff --git a/.gitignore b/.gitignore index 521d8f4fb4..b4f5433084 100644 --- a/.gitignore +++ b/.gitignore @@ -25,6 +25,7 @@ /git-bisect--helper /git-blame /git-branch +/git-bugreport /git-bundle /git-cat-file /git-check-attr diff --git a/Documentation/git-bugreport.txt b/Documentation/git-bugreport.txt new file mode 100644 index 0000000000..c5f45bbee8 --- /dev/null +++ b/Documentation/git-bugreport.txt @@ -0,0 +1,48 @@ +git-bugreport(1) +================ + +NAME +---- +git-bugreport - Collect information for user to file a bug report + +SYNOPSIS +-------- +[verse] +'git bugreport' [-o | --output <path>] + +DESCRIPTION +----------- +Captures information about the user's machine, Git client, and repository state, +as well as a form requesting information about the behavior the user observed, +into a single text file which the user can then share, for example to the Git +mailing list, in order to report an observed bug. + +The following information is requested from the user: + + - Reproduction steps + - Expected behavior + - Actual behavior + +The following information is captured automatically: + + - Git version (`git version --build-options`) + - Machine information (`uname -a`) + - Versions of various dependencies + - Git config contents (`git config --show-origin --list`) + - The contents of all configured git-hooks in `.git/hooks/` + - The contents of `.git/logs` + +OPTIONS +------- +-o [<path>]:: +--output [<path>]:: + Place the resulting bug report file in <path> instead of the root of the + Git repository. + +NOTE +---- +Bug reports can be sent to git@vger.kernel.org. + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Makefile b/Makefile index f9255344ae..69801a1c45 100644 --- a/Makefile +++ b/Makefile @@ -606,6 +606,7 @@ TEST_PROGRAMS_NEED_X = unexport CDPATH SCRIPT_SH += git-bisect.sh +SCRIPT_SH += git-bugreport.sh SCRIPT_SH += git-difftool--helper.sh SCRIPT_SH += git-filter-branch.sh SCRIPT_SH += git-merge-octopus.sh diff --git a/command-list.txt b/command-list.txt index a9ac72bef4..be5a605047 100644 --- a/command-list.txt +++ b/command-list.txt @@ -54,6 +54,7 @@ git-archive mainporcelain git-bisect mainporcelain info git-blame ancillaryinterrogators complete git-branch mainporcelain history +git-bugreport ancillaryinterrogators git-bundle mainporcelain git-cat-file plumbinginterrogators git-check-attr purehelpers diff --git a/git-bugreport.sh b/git-bugreport.sh new file mode 100755 index 0000000000..2200703a51 --- /dev/null +++ b/git-bugreport.sh @@ -0,0 +1,86 @@ +#!/bin/sh + +print_filenames_and_content() { +while read -r line; do + echo "$line"; + echo "========"; + cat "$line"; + echo; +done +} + +generate_report_text() { + + # Generate a form for the user to fill out with their issue. + gettextln "Thank you for filling out a Git bug report!" + gettextln "Please answer the following questions to help us understand your issue." + echo + gettextln "What did you do before the bug happened? (Steps to reproduce your issue)" + echo + gettextln "What did you expect to happen? (Expected behavior)" + echo + gettextln "What happened instead? (Actual behavior)" + echo + gettextln "Anything else you want to add:" + echo + gettextln "Please review the rest of the bug report below." + gettextln "You can delete any lines you don't wish to send." + echo + + echo "[System Information]" + git version --build-options + uname -a + curl-config --version + ldd --version + echo + + echo "[Git Config]" + # TODO: Pass this through grep -v to avoid users sending us their credentials. + git config --show-origin --list + echo + + echo "[Configured Hooks]" + find "$GIT_DIR/hooks/" -type f | grep -v "\.sample$" | print_filenames_and_content + echo + + echo "[Git Logs]" + find "$GIT_DIR/logs" -type f | print_filenames_and_content + echo + +} + +USAGE="[-o | --output <path>]" + +SUBDIRECTORY_OK=t +OPTIONS_SPEC= +. git-sh-setup +. git-sh-i18n + +basedir="$PWD" +while : +do + case "$1" in + -o|--output) + shift + basedir="$1" + shift + continue + ;; + "") + break + ;; + *) + usage + ;; + esac +done + + +# Create bugreport file +BUGREPORT_FILE="$basedir/git-bugreport-$(whoami)-$(hostname)-$(date -Iminutes)" + +generate_report_text >$BUGREPORT_FILE + +git_editor $BUGREPORT_FILE + +eval_gettextln "Your new bug report is in \$BUGREPORT_FILE." diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh new file mode 100755 index 0000000000..6eb2ee4f66 --- /dev/null +++ b/t/t0091-bugreport.sh @@ -0,0 +1,41 @@ +#!/bin/bash + +test_description='git bugreport' + +. ./test-lib.sh + +# Headers "[System Info]" will be followed by a non-empty line if we put some +# information there; we can make sure all our headers were followed by some +# information to check if the command was successful. +HEADER_PATTERN="^\[.*\]$" +check_all_headers_populated() { + while read -r line; do + if [$(grep $HEADER_PATTERN $line)]; then + read -r nextline + if [-z $nextline]; then + return 1; + fi + fi + done +} + +test_expect_success 'creates a report with content in the right places' ' + git bugreport && + check_all_headers_populated <git-bugreport-* && + rm git-bugreport-* +' + +test_expect_success '--output puts the report in the provided dir' ' + mkdir foo/ && + git bugreport -o foo/ && + test -f foo/git-bugreport-* && + rm -fr foo/ +' + +test_expect_success 'incorrect arguments abort with usage' ' + test_must_fail git bugreport --false 2>output && + grep usage output && + test ! -f git-bugreport-* +' + +test_done
Make it easier for users who encounter a bug to send a report by collecting some state information, intended to be viewed by humans familiar with Git. Teach Git how to prompt the user for a good bug report: reproduction steps, expected behavior, and actual behavior. Also, teach Git to write down its own version, the version of some of its dependencies, the operating system information, the effective gitconfig, the configured hooks, and the contents of $GIT_DIR/logs. Finally, make sure Git asks the user to review the contents of the report after it's generated. If users can send us a well-written bug report complete with system information, a remote we may be able to clone, and a reflog showing us where the problem occurred, we can reduce the number of questions and answers which must travel between the reporter and the Git contributor. Users may also wish to send a report like this to their local "Git expert" if they have put their repository into a state they are confused by. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- There are some things I'm not certain about from this patch, I'd appreciate feedback. - Do we want to advertise the Git mailing list for bug reports? - Which config options should we filter to avoid accidentally receiving credentials? - At the moment, the test is trying to check "everything we thought we would populate got populated" - that seemed to me like it would hold up best to changes to the set of information being collected. But maybe there's something more robust we can do. And of course, if there are suggestions for other things to include in the report I'm interested in adding. An example of the output can be found here: https://gist.github.com/nasamuffin/2e320892f5c2147e829cbcf5bd0759a2 Thanks. - Emily .gitignore | 1 + Documentation/git-bugreport.txt | 48 ++++++++++++++++++ Makefile | 1 + command-list.txt | 1 + git-bugreport.sh | 86 +++++++++++++++++++++++++++++++++ t/t0091-bugreport.sh | 41 ++++++++++++++++ 6 files changed, 178 insertions(+) create mode 100644 Documentation/git-bugreport.txt create mode 100755 git-bugreport.sh create mode 100755 t/t0091-bugreport.sh