Message ID | 20241026120950.72727-1-abhijeet.nkt@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v3] show-index: fix uninitialized hash function | expand |
On Sat, Oct 26, 2024 at 05:39:50PM +0530, Abhijeet Sonar wrote: > As stated in the docs, show-index should use SHA1 as the default hash algorithm > when run outsize of a repository. However, 'the_hash_algo' is currently left > uninitialized if we are not in a repository and no explicit hash function is > specified, causing a crash. Fix it by falling back to SHA1 when it is found > uninitialized. Also add test that verifies this behaviour. This commit description is good, and would benefit further from a bisection showing where the regression began. I don't think that it is a prerequisite for us moving this patch forward, though. > Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> > --- > builtin/show-index.c | 3 +++ > t/t5300-pack-object.sh | 4 ++++ > 2 files changed, 7 insertions(+) > > diff --git a/builtin/show-index.c b/builtin/show-index.c > index f164c01bbe..978ae70470 100644 > --- a/builtin/show-index.c > +++ b/builtin/show-index.c > @@ -38,6 +38,9 @@ int cmd_show_index(int argc, > repo_set_hash_algo(the_repository, hash_algo); > } > > + if (!the_hash_algo) > + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); > + > hashsz = the_hash_algo->rawsz; > > if (fread(top_index, 2 * 4, 1, stdin) != 1) > diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh > index 3b9dae331a..51fed26cc4 100755 > --- a/t/t5300-pack-object.sh > +++ b/t/t5300-pack-object.sh > @@ -523,6 +523,10 @@ test_expect_success 'index-pack --strict <pack> works in non-repo' ' > test_path_is_file foo.idx > ' > > +test_expect_success SHA1 'show-index works OK outside a repository' ' > + nongit git show-index <foo.idx > +' > + > test_expect_success !PTHREADS,!FAIL_PREREQS \ > 'index-pack --threads=N or pack.threads=N warns when no pthreads' ' > test_must_fail git index-pack --threads=2 2>err && > -- > 2.47.0.107.g34b6ce9b30 These all look reasonable and as-expected to me. Patrick (CC'd) has been reviewing similar changes elsewhere, so I'd like him to chime in as well on whether or not this looks good to go. Thanks, Taylor
On Sun, Oct 27, 2024 at 08:10:16PM -0400, Taylor Blau wrote: > On Sat, Oct 26, 2024 at 05:39:50PM +0530, Abhijeet Sonar wrote: > > diff --git a/builtin/show-index.c b/builtin/show-index.c > > index f164c01bbe..978ae70470 100644 > > --- a/builtin/show-index.c > > +++ b/builtin/show-index.c > > @@ -38,6 +38,9 @@ int cmd_show_index(int argc, > > repo_set_hash_algo(the_repository, hash_algo); > > } > > > > + if (!the_hash_algo) > > + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); Let's add a todo-comment here. The behaviour with this patch is somewhat broken as you cannot inspect indices that use any other object hash than SHA256 outside of a repository. This is fine from my point of view and nothing that you have to fix here, as you simply fix up the broken behaviour. But in the future, we should either: - Add logic to detect the format of the passed-in index and set that up as the hash algorithm. - If that is impossible, add a command line option to pick the hash algo. > > hashsz = the_hash_algo->rawsz; > > > > if (fread(top_index, 2 * 4, 1, stdin) != 1) > > diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh > > index 3b9dae331a..51fed26cc4 100755 > > --- a/t/t5300-pack-object.sh > > +++ b/t/t5300-pack-object.sh > > @@ -523,6 +523,10 @@ test_expect_success 'index-pack --strict <pack> works in non-repo' ' > > test_path_is_file foo.idx > > ' > > > > +test_expect_success SHA1 'show-index works OK outside a repository' ' > > + nongit git show-index <foo.idx > > +' So how does this behave with SHA256? Does it raise an error? Does it segfault? I think it's okay to fail with SHA256 for now, but I'd like the failure behaviour to be cleanish. So I'd prefer to not skip the test completely, but adapt our expectations based on the hash algo. Or have two separate tests, one for each hash, that explicitly init the repo with `git init --ref-format=$hash`, and then exercise the behaviour for each of them. > > test_expect_success !PTHREADS,!FAIL_PREREQS \ > > 'index-pack --threads=N or pack.threads=N warns when no pthreads' ' > > test_must_fail git index-pack --threads=2 2>err && > > -- > > 2.47.0.107.g34b6ce9b30 > > These all look reasonable and as-expected to me. Patrick (CC'd) has been > reviewing similar changes elsewhere, so I'd like him to chime in as well > on whether or not this looks good to go. Ah, thanks. I've missed this topic somehow. Patrick
On Mon, Oct 28, 2024 at 06:35:15AM +0100, Patrick Steinhardt wrote: > > > test_expect_success !PTHREADS,!FAIL_PREREQS \ > > > 'index-pack --threads=N or pack.threads=N warns when no pthreads' ' > > > test_must_fail git index-pack --threads=2 2>err && > > > -- > > > 2.47.0.107.g34b6ce9b30 > > > > These all look reasonable and as-expected to me. Patrick (CC'd) has been > > reviewing similar changes elsewhere, so I'd like him to chime in as well > > on whether or not this looks good to go. > > Ah, thanks. I've missed this topic somehow. Not a problem at all. Thanks for a very helpful review. Abhijeet: I've gone ahead and marked this in my notes as "expecting another round" to address the feedback from Patrick. I'll keep my eyes out for the new version of this patch. Thanks! Thanks, Taylor
On 28/10/24 05:40, Taylor Blau wrote: > On Sat, Oct 26, 2024 at 05:39:50PM +0530, Abhijeet Sonar wrote: >> As stated in the docs, show-index should use SHA1 as the default hash algorithm >> when run outsize of a repository. However, 'the_hash_algo' is currently left >> uninitialized if we are not in a repository and no explicit hash function is >> specified, causing a crash. Fix it by falling back to SHA1 when it is found >> uninitialized. Also add test that verifies this behaviour. > > This commit description is good, and would benefit further from a > bisection showing where the regression began. I don't think that it is a > prerequisite for us moving this patch forward, though. On bisecting, the offending commit appears to be c8aed5e8da (repository: stop setting SHA1 as the default object hash). Another related commit is ab274909d4 (builtin/diff: explicitly set hash algo when there is no repo) which did something very similar to my patch. Thanks
On 28/10/24 11:05, Patrick Steinhardt wrote: > On Sun, Oct 27, 2024 at 08:10:16PM -0400, Taylor Blau wrote: >> On Sat, Oct 26, 2024 at 05:39:50PM +0530, Abhijeet Sonar wrote: >>> diff --git a/builtin/show-index.c b/builtin/show-index.c >>> index f164c01bbe..978ae70470 100644 >>> --- a/builtin/show-index.c >>> +++ b/builtin/show-index.c >>> @@ -38,6 +38,9 @@ int cmd_show_index(int argc, >>> repo_set_hash_algo(the_repository, hash_algo); >>> } >>> >>> + if (!the_hash_algo) >>> + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); > > Let's add a todo-comment here. The behaviour with this patch is somewhat > broken as you cannot inspect indices that use any other object hash than > SHA256 outside of a repository. This is fine from my point of view and > nothing that you have to fix here, as you simply fix up the broken > behaviour. But in the future, we should either: I will add those comments, thanks. > > - Add logic to detect the format of the passed-in index and set that > up as the hash algorithm I am very interested in hacking around and trying to implement this. I read about the index format here: https://git-scm.com/docs/index-format and (assuming I am looking at the right thing) it does not seem like index files contain information about the hash algorithm used in their headers or anywhere else. Are there other leads I should follow? > > - If that is impossible, add a command line option to pick the hash > algo. > Actually there is already an option (--object-format) that allows changing the hash algo used, it's just that default format is SHA1 when one is not specified. (or at least will be, after this patch) >>> hashsz = the_hash_algo->rawsz; >>> >>> if (fread(top_index, 2 * 4, 1, stdin) != 1) >>> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh >>> index 3b9dae331a..51fed26cc4 100755 >>> --- a/t/t5300-pack-object.sh >>> +++ b/t/t5300-pack-object.sh >>> @@ -523,6 +523,10 @@ test_expect_success 'index-pack --strict <pack> works in non-repo' ' >>> test_path_is_file foo.idx >>> ' >>> >>> +test_expect_success SHA1 'show-index works OK outside a repository' ' >>> + nongit git show-index <foo.idx >>> +' > > So how does this behave with SHA256? Does it raise an error? Does it > segfault? > > I think it's okay to fail with SHA256 for now, but I'd like the > failure behaviour to be cleanish. So I'd prefer to not skip the test > completely, but adapt our expectations based on the hash algo. Or have > two separate tests, one for each hash, that explicitly init the repo > with `git init --ref-format=$hash`, and then exercise the behaviour for > each of the> Running show-index outside of a repository with a SHA256 based index file gives: $ git show-index <foo.idx fatal: inconsistent 64b offset index At the very least, it does not crash. >>> test_expect_success !PTHREADS,!FAIL_PREREQS \ >>> 'index-pack --threads=N or pack.threads=N warns when no pthreads' ' >>> test_must_fail git index-pack --threads=2 2>err && >>> -- >>> 2.47.0.107.g34b6ce9b30 >> >> These all look reasonable and as-expected to me. Patrick (CC'd) has been >> reviewing similar changes elsewhere, so I'd like him to chime in as well >> on whether or not this looks good to go. > > Ah, thanks. I've missed this topic somehow. > > Patrick Thanks
diff --git a/builtin/show-index.c b/builtin/show-index.c index f164c01bbe..978ae70470 100644 --- a/builtin/show-index.c +++ b/builtin/show-index.c @@ -38,6 +38,9 @@ int cmd_show_index(int argc, repo_set_hash_algo(the_repository, hash_algo); } + if (!the_hash_algo) + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); + hashsz = the_hash_algo->rawsz; if (fread(top_index, 2 * 4, 1, stdin) != 1) diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index 3b9dae331a..51fed26cc4 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -523,6 +523,10 @@ test_expect_success 'index-pack --strict <pack> works in non-repo' ' test_path_is_file foo.idx ' +test_expect_success SHA1 'show-index works OK outside a repository' ' + nongit git show-index <foo.idx +' + test_expect_success !PTHREADS,!FAIL_PREREQS \ 'index-pack --threads=N or pack.threads=N warns when no pthreads' ' test_must_fail git index-pack --threads=2 2>err &&
As stated in the docs, show-index should use SHA1 as the default hash algorithm when run outsize of a repository. However, 'the_hash_algo' is currently left uninitialized if we are not in a repository and no explicit hash function is specified, causing a crash. Fix it by falling back to SHA1 when it is found uninitialized. Also add test that verifies this behaviour. Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> --- builtin/show-index.c | 3 +++ t/t5300-pack-object.sh | 4 ++++ 2 files changed, 7 insertions(+)