diff mbox series

[v3,1/2] builtin/shortlog: explicitly set hash algo when there is no repo

Message ID 20241016182124.48148-2-wolf@oriole.systems (mailing list archive)
State Superseded
Headers show
Series builtin/shortlog: explicitly set hash algo when there is no repo | expand

Commit Message

Wolfgang Müller Oct. 16, 2024, 6:21 p.m. UTC
Whilst git-shortlog(1) does not explicitly need any repository
information when run without reference to one, it still parses some of
its arguments with parse_revision_opt() which assumes that the hash
algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) we stopped setting up a default
hash algorithm and instead require commands to set it up explicitly.

This was done for most other commands like in ab274909d4 (builtin/diff:
explicitly set hash algo when there is no repo, 2024-05-07) but was
missed for builtin/shortlog, making git-shortlog(1) segfault outside of
a repository when given arguments like --author that trigger a call to
parse_revision_opt().

Fix this for now by explicitly setting the hash algorithm to SHA1. Also
add a regression test for the segfault.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Thanks-to: Eric Sunshine <sunshine@sunshineco.com>
---
 builtin/shortlog.c  | 12 ++++++++++++
 t/t4201-shortlog.sh |  4 ++++
 2 files changed, 16 insertions(+)

Comments

Taylor Blau Oct. 16, 2024, 7:22 p.m. UTC | #1
On Wed, Oct 16, 2024 at 08:21:23PM +0200, Wolfgang Müller wrote:
> Whilst git-shortlog(1) does not explicitly need any repository
> information when run without reference to one, it still parses some of
> its arguments with parse_revision_opt() which assumes that the hash
> algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
> as the default object hash, 2024-05-07) we stopped setting up a default
> hash algorithm and instead require commands to set it up explicitly.
>
> This was done for most other commands like in ab274909d4 (builtin/diff:
> explicitly set hash algo when there is no repo, 2024-05-07) but was
> missed for builtin/shortlog, making git-shortlog(1) segfault outside of
> a repository when given arguments like --author that trigger a call to
> parse_revision_opt().

Good analysis.

> Fix this for now by explicitly setting the hash algorithm to SHA1. Also
> add a regression test for the segfault.

Makes sense.

> Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
> Thanks-to: Eric Sunshine <sunshine@sunshineco.com>

In the future, please ensure that your Signed-off-by is the last trailer
in the section to indicate that you have certified everything above it
(which in that case would include your Thanks-to here).

> @@ -143,6 +143,10 @@ fuzz()
>  	test_grep "too many arguments" out
>  '
>
> +test_expect_success 'shortlog --author from non-git directory does not segfault' '
> +	echo | nongit git shortlog --author=author
> +'
> +

Instead of 'echo |', I wonder if it would be just as good to write this
test as:

    nongit git shortlog --author=author </dev/null

Thanks,
Taylor
Wolfgang Müller Oct. 16, 2024, 7:37 p.m. UTC | #2
> Instead of 'echo |', I wonder if it would be just as good to write this
> test as:
> 
>     nongit git shortlog --author=author </dev/null

Existing tests seem to prefer </dev/null overwhelmingly, so I think I'll
change that. Shouldn't make a difference to the test itself.
Patrick Steinhardt Oct. 17, 2024, 11:58 a.m. UTC | #3
On Wed, Oct 16, 2024 at 03:22:04PM -0400, Taylor Blau wrote:
> On Wed, Oct 16, 2024 at 08:21:23PM +0200, Wolfgang Müller wrote:
> > Whilst git-shortlog(1) does not explicitly need any repository
> > information when run without reference to one, it still parses some of
> > its arguments with parse_revision_opt() which assumes that the hash
> > algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
> > as the default object hash, 2024-05-07) we stopped setting up a default
> > hash algorithm and instead require commands to set it up explicitly.
> >
> > This was done for most other commands like in ab274909d4 (builtin/diff:
> > explicitly set hash algo when there is no repo, 2024-05-07) but was
> > missed for builtin/shortlog, making git-shortlog(1) segfault outside of
> > a repository when given arguments like --author that trigger a call to
> > parse_revision_opt().
> 
> Good analysis.
> 
> > Fix this for now by explicitly setting the hash algorithm to SHA1. Also
> > add a regression test for the segfault.
> 
> Makes sense.
> 
> > Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
> > Thanks-to: Eric Sunshine <sunshine@sunshineco.com>
> 
> In the future, please ensure that your Signed-off-by is the last trailer
> in the section to indicate that you have certified everything above it
> (which in that case would include your Thanks-to here).

I think it's also more common to use "Helped-by" instead of "Thanks-to".
I see you have the same trailer in v4, but don't necessarily think that
it is a good-enough reason to reroll.

Patrick
Wolfgang Müller Oct. 17, 2024, 12:09 p.m. UTC | #4
On 2024-10-17 13:58, Patrick Steinhardt wrote:
> I think it's also more common to use "Helped-by" instead of "Thanks-to".
> I see you have the same trailer in v4, but don't necessarily think that
> it is a good-enough reason to reroll.

That is good to know, thanks. Is there by chance any document that
outlines usage guidelines for when to use which trailer or is it more of
an undocumented custom?
Patrick Steinhardt Oct. 17, 2024, 12:11 p.m. UTC | #5
On Thu, Oct 17, 2024 at 02:09:12PM +0200, Wolfgang Müller wrote:
> On 2024-10-17 13:58, Patrick Steinhardt wrote:
> > I think it's also more common to use "Helped-by" instead of "Thanks-to".
> > I see you have the same trailer in v4, but don't necessarily think that
> > it is a good-enough reason to reroll.
> 
> That is good to know, thanks. Is there by chance any document that
> outlines usage guidelines for when to use which trailer or is it more of
> an undocumented custom?

We have Documentation/SubmittingPatches, which has a section named
"commit-trailers".

Patrick
diff mbox series

Patch

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 3ed5c46078..c86b75d981 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -407,6 +407,18 @@  int cmd_shortlog(int argc,
 
 	struct parse_opt_ctx_t ctx;
 
+	/*
+	 * NEEDSWORK: Later on we'll call parse_revision_opt which relies on
+	 * the hash algorithm being set but since we are operating outside of a
+	 * Git repository we cannot determine one. This is only needed because
+	 * parse_revision_opt expects hexsz for --abbrev which is irrelevant
+	 * for shortlog outside of a git repository. For now explicitly set
+	 * SHA1, but ideally the parsing machinery would be split between
+	 * git/nongit so that we do not have to do this.
+	 */
+	if (nongit && !the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	git_config(git_default_config, NULL);
 	shortlog_init(&log);
 	repo_init_revisions(the_repository, &rev, prefix);
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index c20c885724..0cc2aa5089 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -143,6 +143,10 @@  fuzz()
 	test_grep "too many arguments" out
 '
 
+test_expect_success 'shortlog --author from non-git directory does not segfault' '
+	echo | nongit git shortlog --author=author
+'
+
 test_expect_success 'shortlog should add newline when input line matches wraplen' '
 	cat >expect <<\EOF &&
 A U Thor (2):