Message ID | 20240513224127.2042052-5-gitster@pobox.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix use of uninitialized hash algorithms | expand |
Junio C Hamano <gitster@pobox.com> writes: > From: Patrick Steinhardt <ps@pks.im> > > The git-hash-object(1) command allows users to hash an object even > without a repository. Starting with c8aed5e8da (repository: stop setting > SHA1 as the default object hash, 2024-05-07), this will make us hit an > uninitialized hash function, which subsequently leads to a segfault. > > Fix this by falling back to SHA-1 explicitly when running outside of a > Git repository. Users can use GIT_DEFAULT_HASH_ALGORITHM environment to > specify what hash algorithm they want, so arguably this code should not > be needed. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > builtin/hash-object.c | 3 +++ > t/t1007-hash-object.sh | 6 ++++++ > t/t1517-outside-repo.sh | 2 +- > 3 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/builtin/hash-object.c b/builtin/hash-object.c > index 82ca6d2bfd..c767414a0c 100644 > --- a/builtin/hash-object.c > +++ b/builtin/hash-object.c > @@ -123,6 +123,9 @@ int cmd_hash_object(int argc, const char **argv, const char *prefix) > else > prefix = setup_git_directory_gently(&nongit); > > + if (nongit && !the_hash_algo) > + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); This is slightly different from your version, which always used SHA-1 when nongit is true, in the hope that it would make it more useful if the user can say "GIT_DEFAULT_HASH_ALGORITHM=sha256 git hash-objects" outside a repository. I am not 100% convinced it is better or we rather should aim for predictability that you'd always and only get SHA-1 outside a repository.
On Mon, May 13, 2024 at 04:13:19PM -0700, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > From: Patrick Steinhardt <ps@pks.im> > > > > The git-hash-object(1) command allows users to hash an object even > > without a repository. Starting with c8aed5e8da (repository: stop setting > > SHA1 as the default object hash, 2024-05-07), this will make us hit an > > uninitialized hash function, which subsequently leads to a segfault. > > > > Fix this by falling back to SHA-1 explicitly when running outside of a > > Git repository. Users can use GIT_DEFAULT_HASH_ALGORITHM environment to > > specify what hash algorithm they want, so arguably this code should not > > be needed. > > > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > > --- > > builtin/hash-object.c | 3 +++ > > t/t1007-hash-object.sh | 6 ++++++ > > t/t1517-outside-repo.sh | 2 +- > > 3 files changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/builtin/hash-object.c b/builtin/hash-object.c > > index 82ca6d2bfd..c767414a0c 100644 > > --- a/builtin/hash-object.c > > +++ b/builtin/hash-object.c > > @@ -123,6 +123,9 @@ int cmd_hash_object(int argc, const char **argv, const char *prefix) > > else > > prefix = setup_git_directory_gently(&nongit); > > > > + if (nongit && !the_hash_algo) > > + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); > > This is slightly different from your version, which always used SHA-1 > when nongit is true, in the hope that it would make it more useful if > the user can say "GIT_DEFAULT_HASH_ALGORITHM=sha256 git hash-objects" > outside a repository. I am not 100% convinced it is better or we > rather should aim for predictability that you'd always and only get > SHA-1 outside a repository. I'd prefer the latter -- always use SHA-1. As you say, it's easier to understand and doesn't create implicit mechanisms that we'll have to maintain going forward. Also, users didn't have a desire yet to pick a different algorithm than SHA-1, which probably also comes from the fact that SHA-256 repositories are still scarce. Eventually, we should then add a new option `--object-hash=` to git-hash-object(1) and other commands that may run outside of a Git repository to let the user pick their desired hash. Patrick
Patrick Steinhardt <ps@pks.im> writes: > I'd prefer the latter -- always use SHA-1. As you say, it's easier to > understand and doesn't create implicit mechanisms that we'll have to > maintain going forward. Also, users didn't have a desire yet to pick a > different algorithm than SHA-1, which probably also comes from the fact > that SHA-256 repositories are still scarce. > ... > Eventually, we should then add a new option `--object-hash=` to > git-hash-object(1) and other commands that may run outside of a Git > repository to let the user pick their desired hash. If we were planning to allow using something other than SHA-1 in the future, then I do not think I agree with your argument. We can use the same GIT_DEFAULT_HASH mechanism to specify what hash to use without later adding an extra option just fine.
diff --git a/builtin/hash-object.c b/builtin/hash-object.c index 82ca6d2bfd..c767414a0c 100644 --- a/builtin/hash-object.c +++ b/builtin/hash-object.c @@ -123,6 +123,9 @@ int cmd_hash_object(int argc, const char **argv, const char *prefix) else prefix = setup_git_directory_gently(&nongit); + if (nongit && !the_hash_algo) + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); + if (vpath && prefix) { vpath_free = prefix_filename(prefix, vpath); vpath = vpath_free; diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 64aea38486..4c138c6ca4 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -260,4 +260,10 @@ test_expect_success '--literally with extra-long type' ' echo example | git hash-object -t $t --literally --stdin ' +test_expect_success '--stdin outside of repository' ' + nongit git hash-object --stdin <hello >actual && + echo "$(test_oid hello)" >expect && + test_cmp expect actual +' + test_done diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh index f1fd5c9888..80261e43c7 100755 --- a/t/t1517-outside-repo.sh +++ b/t/t1517-outside-repo.sh @@ -33,7 +33,7 @@ test_expect_success 'compute a patch-id outside repository' ' test_cmp patch-id.expect patch-id.actual ' -test_expect_failure 'hash-object outside repository' ' +test_expect_success 'hash-object outside repository' ' git hash-object --stdin <sample.patch >hash.expect && ( cd non-repo &&