Message ID | 20240520231434.1816979-1-gitster@pobox.com (mailing list archive) |
---|---|
Series | Fix use of uninitialized hash algorithms |

On Mon, May 20, 2024 at 04:14:29PM -0700, Junio C Hamano wrote:
> A change recently merged to 'next' stops us from defaulting to using
> SHA-1 unless other code (like a logic early in the start-up sequence
> to see what hash is being used in the repository we are working in)
> explicitly sets it, leading to a (deliberate) crash of "git" when we
> forgot to cover certain code paths.
>
> It turns out we have a few.  Notable ones are all operations that
> are designed to work outside a repository.  We should go over all
> such code paths and give them a reasonable default when there is one
> available (e.g. for historical reasons, patch-id is documented to
> work with SHA-1 hashes, so arguably it, or at least when it is
> invoked with the "--stable" option, should do so everywhere, not
> just in SHA-1 repositories, but in SHA-256 repository or outside any
> repository).  In the meantime, if an end-user hits such a "bug"
> before we can fix it, it would be nice to give them an escape hatch
> to restore the historical behaviour of falling back to use SHA-1.
>
> These patches are designed to apply on a merge of c8aed5e8
> (repository: stop setting SHA1 as the default object hash,
> 2024-05-07) into 3e4a232f (The third batch, 2024-05-13), which has
> been the same base throughout the past iterations.
>
> In this fifth iteration:
>
>  - The first step no longer falls back to GIT_DEFAULT_HASH; the
>    escape hatch is a dedicated GIT_TEST_DEFAULT_HASH_ALGO
>    environment variable, but hopefully we do not have to advertise
>    it all that often.
>
>  - The second step has been simplified somewhat to use the "nongit"
>    helper when we only need to run a single "git" command in t1517.
>    The way the expected output files were prepared in the previous
>    versions did not correctly force use of SHA-1 algorithm, which
>    has been corrected.  The third step and fourth step for t1517
>    continue to be "flip expect_failure to expect_success", but you
>    can see context differences in the range-diff.
>
>  - The fourth step also has a fix for t1007 where the previous
>    iterations did not correctly force use of SHA-1 to prepare the
>    expected output.
>
> Otherwise this round should be ready, modulo possible typos.

I have two smallish comments, but neither of them really have to be
addressed. Overall I very much agree with this iteration and think that
it's the right way to go.

Thanks!

Patrick
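
The escape hatch described in the cover letter can be pictured with a small standalone sketch. It is not the code from the series: every `sketch_` name below is hypothetical, and only the environment variable name `GIT_TEST_DEFAULT_HASH_ALGO` and the value "sha1" come from the thread. The assumption is that an algorithm already chosen by earlier start-up code always wins, and the variable is consulted only when nothing has been set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an "object hash algorithm" identifier. */
enum sketch_hash_algo {
    SKETCH_HASH_UNKNOWN = 0, /* nothing has set an algorithm yet */
    SKETCH_HASH_SHA1,
    SKETCH_HASH_SHA256
};

/* Map a textual name to an algorithm; unknown or missing names stay unset. */
static enum sketch_hash_algo sketch_algo_by_name(const char *name)
{
    if (!name)
        return SKETCH_HASH_UNKNOWN;
    if (!strcmp(name, "sha1"))
        return SKETCH_HASH_SHA1;
    if (!strcmp(name, "sha256"))
        return SKETCH_HASH_SHA256;
    return SKETCH_HASH_UNKNOWN;
}

/*
 * Keep an explicitly chosen algorithm. Only when none was chosen,
 * fall back to whatever the escape-hatch value names (if anything).
 */
static enum sketch_hash_algo
sketch_resolve_algo(enum sketch_hash_algo current, const char *escape_hatch)
{
    if (current != SKETCH_HASH_UNKNOWN)
        return current;
    return sketch_algo_by_name(escape_hatch);
}

/* Convenience wrapper that reads the variable named in the thread. */
static enum sketch_hash_algo
sketch_resolve_algo_from_env(enum sketch_hash_algo current)
{
    return sketch_resolve_algo(current, getenv("GIT_TEST_DEFAULT_HASH_ALGO"));
}
```

In this sketch a result of `SKETCH_HASH_UNKNOWN` means the caller still has no algorithm, which corresponds to the case the cover letter treats as a bug to be fixed in the affected command.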

Patrick Steinhardt <ps@pks.im> writes:

> I have two smallish comments, but neither of them really have to be
> addressed. Overall I very much agree with this iteration and think that
> it's the right way to go.

I've done the following locally but it probably does not
need to be resent to the list before merging down to 'next'.

1:  b23a93597c ! 1:  d3b2ff75fd setup: add an escape hatch for "no more default hash algorithm" change
    @@ Commit message
         default object hash, 2024-05-07), to keep end-user systems still
         broken when we have gap in our test coverage but yet give them an
         escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
    -    variable to "sha1" in order to revert to the previous behaviour.
    +    variable to "sha1" in order to revert to the previous behaviour, in
    +    case we haven't done a thorough job in fixing the fallout from
    +    c8aed5e8.  After we build confidence, we should remove the escape
    +    hatch support, but we are not there yet after only fixing three
    +    commands (hash-object, apply, and patch-id) in this series.

         Due to the way the end-user facing GIT_DEFAULT_HASH environment
         variable is used in our test suite, we unfortunately cannot reuse it
2:  6a20370944 = 2:  abece6e970 t1517: test commands that are designed to be run outside repository
3:  fa258c5d47 = 3:  4a1c95931f builtin/patch-id: fix uninitialized hash function
4:  164d340cbe = 4:  8d058b8024 builtin/hash-object: fix uninitialized hash function
5:  bd0246eb51 ! 5:  4674ab682d apply: fix uninitialized hash function
    @@ Commit message
         Make sure we explicitly fall back to SHA-1 algorithm for backward
         compatibility.

    +    It is of dubious value to make this configurable to other hash
    +    algorithms, as the code does not use the_hash_algo for hashing
    +    purposes when working outside a repository (which is how
    +    the_hash_algo is left to NULL)---it is only used to learn the max
    +    length of the hash when parsing the object names on the "index"
    +    line, but failing to parse the "index" line is not a hard failure,
    +    and the program does not support operations like applying binary
    +    patches and --3way fallback that requires object access outside a
    +    repository.
    +
         Signed-off-by: Junio C Hamano <gitster@pobox.com>

      ## builtin/apply.c ##
    @@ builtin/apply.c: int cmd_apply(int argc, const char **argv, const char *prefix)
      	if (init_apply_state(&state, the_repository, prefix))
      		exit(128);

    ++	/*
    ++	 * We could to redo the "apply.c" machinery to make this
    ++	 * arbitrary fallback unnecessary, but it is dubious that it
    ++	 * is worth the effort.
    ++	 * cf. https://lore.kernel.org/git/xmqqcypfcmn4.fsf@gitster.g/
    ++	 */
     +	if (!the_hash_algo)
     +		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
     +
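
The amended commit message for the apply step argues that, outside a repository, the hash algorithm only supplies the maximum length of an object name on the "index" line, and that a failed parse is not fatal. A rough standalone sketch of that idea follows. It is not the parser in apply.c: the `sketch_` names, the return convention, and the simplified `index <old>..<new>[ <mode>]` layout are assumptions made for illustration; the only fact relied on is that a SHA-1 object name is at most 40 hexadecimal digits.

```c
#include <ctype.h>
#include <string.h>

/* A full SHA-1 object name is 40 hex digits; abbreviations are shorter. */
#define SKETCH_SHA1_HEXSZ 40

/* Length of the leading hex run, or 0 if it is empty or longer than max_hexsz. */
static size_t sketch_hex_run(const char *s, size_t max_hexsz)
{
    size_t n = 0;

    while (isxdigit((unsigned char)s[n]))
        n++;
    return n <= max_hexsz ? n : 0;
}

/*
 * Parse "index <old>..<new>[ <mode>]". The hash algorithm matters only
 * through max_hexsz. old_name and new_name must each hold max_hexsz + 1
 * bytes. Returns 0 on success and -1 otherwise; a caller may treat -1
 * as "no usable index line" rather than as a fatal error.
 */
static int sketch_parse_index_line(const char *line, size_t max_hexsz,
                                   char *old_name, char *new_name)
{
    const char *p = line;
    size_t n;

    if (strncmp(p, "index ", 6))
        return -1;
    p += 6;

    n = sketch_hex_run(p, max_hexsz);
    if (!n || p[n] != '.' || p[n + 1] != '.')
        return -1;
    memcpy(old_name, p, n);
    old_name[n] = '\0';
    p += n + 2;

    n = sketch_hex_run(p, max_hexsz);
    if (!n || (p[n] != ' ' && p[n] != '\n' && p[n] != '\0'))
        return -1;
    memcpy(new_name, p, n);
    new_name[n] = '\0';
    return 0;
}
```

Under these assumptions, picking SHA-1 as the fallback only bounds how long a name the parser accepts; a longer name simply makes the line unparseable, which the caller can ignore.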

On Tue, May 21, 2024 at 11:07:12AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > I have two smallish comments, but neither of them really have to be
> > addressed. Overall I very much agree with this iteration and think that
> > it's the right way to go.
>
> I've done the following locally but it probably does not
> need to be resent to the list before merging down to 'next'.

Thanks, the diff looks good to me.

Patrick