Message ID | 20200911233815.2808426-1-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | builtin/clone: avoid failure with GIT_DEFAULT_HASH | expand |
On Fri, Sep 11, 2020 at 11:38:15PM +0000, brian m. carlson wrote: > Neither of those are appealing, so let's tell the repository > initialization code if we're doing a reinit like this, and if so, to > clear the extension if we're using SHA-1. This makes sure we produce a > valid and functional repository and doesn't break any of our other use > cases. Your explanation makes sense, and what you're proposing here seems like a very reasonable path forward. I can't think of anything that would cause it not to work as expected. > -void initialize_repository_version(int hash_algo); > +void initialize_repository_version(int hash_algo, int reinit); I'm not a huge fan of adding a 'reinit' parameter to a function that itself begins with the word 'initialize' (why wouldn't you call 'reinitialize_repository_version()' instead?), but seeing as there are only a couple of callers, maybe it is OK. Alternatively, I certainly wouldn't complain if you did introduce a new function and updated the call-site that passes reinit as 1. > +test_expect_success 'clone with GIT_DEFAULT_HASH' ' > + ( > + sane_unset GIT_DEFAULT_HASH && > + git init test > + ) && > + test_commit -C test foo && > + git clone test test-clone && > + git -C test-clone status > +' > + This test looks very reasonable, and certainly demonstrates the bug and fix. Thanks. With or without my nitpick: Reviewed-by: Taylor Blau <me@ttaylorr.com> Thanks, Taylor
On 2020-09-12 at 03:24:48, Taylor Blau wrote: > On Fri, Sep 11, 2020 at 11:38:15PM +0000, brian m. carlson wrote: > > -void initialize_repository_version(int hash_algo); > > +void initialize_repository_version(int hash_algo, int reinit); > > I'm not a huge fan of adding a 'reinit' parameter to a function that > itself begins with the word 'initialize' (why wouldn't you call > 'reinitialize_repository_version()' instead?), but seeing as there are > only a couple of callers, maybe it is OK. > > Alternatively, I certainly wouldn't complain if you did introduce a new > function and updated the call-site that passes reinit as 1. I thought about introducing a new function, but since it would share almost all of the code, it seemed a bit wasteful, even if the function is small. We do have only two callers, I believe, since I recall making this function non-static and calling it from clone, so I think it's okay. > > +test_expect_success 'clone with GIT_DEFAULT_HASH' ' > > + ( > > + sane_unset GIT_DEFAULT_HASH && > > + git init test > > + ) && > > + test_commit -C test foo && > > + git clone test test-clone && > > + git -C test-clone status > > +' > > + > > This test looks very reasonable, and certainly demonstrates the bug and > fix. Thanks. This is essentially Matheus's testcase. I considered using env -u to avoid the subshell but POSIX doesn't specify that, so we have a subshell.
"brian m. carlson" <sandals@crustytoothpaste.net> writes: >> Alternatively, I certainly wouldn't complain if you did introduce a new >> function and updated the call-site that passes reinit as 1. > > I thought about introducing a new function, but since it would share > almost all of the code, it seemed a bit wasteful, even if the function > is small. We do have only two callers, I believe, since I recall > making this function non-static and calling it from clone, so I think > it's okay. Perhaps. FWIW, this seems to have strange interaction with something in 'seen' when merged; I suspect it is the topic that mucks with the set-up sequence for "git clone", but didn't check.
Junio C Hamano <gitster@pobox.com> writes: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >>> Alternatively, I certainly wouldn't complain if you did introduce a new >>> function and updated the call-site that passes reinit as 1. >> >> I thought about introducing a new function, but since it would share >> almost all of the code, it seemed a bit wasteful, even if the function >> is small. We do have only two callers, I believe, since I recall >> making this function non-static and calling it from clone, so I think >> it's okay. > > Perhaps. > > FWIW, this seems to have strange interaction with something in > 'seen' when merged; I suspect it is the topic that mucks with the > set-up sequence for "git clone", but didn't check. Actually I have to take it back. I have this directly on top of v2.28.0 and it already breaks tests big time. For example, here is how "cd t && sh t0021-conversion.sh -i -v" breaks: ... Cloning into 'repo-cloned'... fatal: could not unset 'extensions.objectformat' fatal: the remote end hung up unexpectedly not ok 25 - delayed checkout in process filter The story is the same if it is applied on top of 'master' (which is expected, as we haven't done anything to affect this area since v2.28.0).
On 2020-09-14 at 21:44:45, Junio C Hamano wrote: > Actually I have to take it back. I have this directly on top of > v2.28.0 and it already breaks tests big time. For example, here is > how "cd t && sh t0021-conversion.sh -i -v" breaks: > > ... > Cloning into 'repo-cloned'... > fatal: could not unset 'extensions.objectformat' > fatal: the remote end hung up unexpectedly > not ok 25 - delayed checkout in process filter > > The story is the same if it is applied on top of 'master' (which is > expected, as we haven't done anything to affect this area since > v2.28.0). Ah, yes. I build with GIT_TEST_DEFAULT_HASH=sha256, but if you run the tests with SHA-1, it fails. Feel free to drop it for now and I'll send out a v2 soon. Sorry about that.
diff --git a/builtin/clone.c b/builtin/clone.c index b087ee40c2..925a2e3dd6 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1235,7 +1235,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) * Now that we know what algorithm the remote side is using, * let's set ours to the same thing. */ - initialize_repository_version(hash_algo); + initialize_repository_version(hash_algo, 1); repo_set_hash_algo(the_repository, hash_algo); mapped_refs = wanted_peer_refs(refs, &remote->fetch); diff --git a/builtin/init-db.c b/builtin/init-db.c index cd3e760541..57e15e93da 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -179,7 +179,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree) return 1; } -void initialize_repository_version(int hash_algo) +void initialize_repository_version(int hash_algo, int reinit) { char repo_version_string[10]; int repo_version = GIT_REPO_VERSION; @@ -195,6 +195,8 @@ void initialize_repository_version(int hash_algo) if (hash_algo != GIT_HASH_SHA1) git_config_set("extensions.objectformat", hash_algos[hash_algo].name); + else if (reinit) + git_config_set("extensions.objectformat", NULL); } static int create_default_files(const char *template_path, @@ -277,7 +279,7 @@ static int create_default_files(const char *template_path, free(ref); } - initialize_repository_version(fmt->hash_algo); + initialize_repository_version(fmt->hash_algo, 0); /* Check filemode trustability */ path = git_path_buf(&buf, "config"); diff --git a/cache.h b/cache.h index cee8aa5dc3..c0072d43b1 100644 --- a/cache.h +++ b/cache.h @@ -629,7 +629,7 @@ int path_inside_repo(const char *prefix, const char *path); int init_db(const char *git_dir, const char *real_git_dir, const char *template_dir, int hash_algo, const char *initial_branch, unsigned int flags); -void initialize_repository_version(int hash_algo); +void initialize_repository_version(int hash_algo, int reinit); void sanitize_stdfds(void); int daemonize(void); diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 15fb64c18d..570d989795 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -631,6 +631,16 @@ test_expect_success CASE_INSENSITIVE_FS 'colliding file detection' ' test_i18ngrep "the following paths have collided" icasefs/warning ' +test_expect_success 'clone with GIT_DEFAULT_HASH' ' + ( + sane_unset GIT_DEFAULT_HASH && + git init test + ) && + test_commit -C test foo && + git clone test test-clone && + git -C test-clone status +' + partial_clone_server () { SERVER="$1" &&
If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to "sha256", then we can end up with a repository where the repository format version is 0 but the extensions.objectformat key is set to "sha256". This is both wrong (the user has a SHA-1 repository) and nonfunctional (because the extension cannot be used in a v0 repository). This happens because in a clone, we initially set up the repository, and then change its algorithm based on what the remote side tells us it's using. We've initially set up the repository as SHA-256 in this case, and then later on reset the repository version without clearing the extension. We could just always set the extension in this case, but that would mean that our SHA-1 repositories weren't compatible with older Git versions, even though there's no reason why they shouldn't be. And we also don't want to initialize the repository as SHA-1 initially, since that means if we're cloning an empty repository, we'll have failed to honor the GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a SHA-256 repository. Neither of those are appealing, so let's tell the repository initialization code if we're doing a reinit like this, and if so, to clear the extension if we're using SHA-1. This makes sure we produce a valid and functional repository and doesn't break any of our other use cases. Reported-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- builtin/clone.c | 2 +- builtin/init-db.c | 6 ++++-- cache.h | 2 +- t/t5601-clone.sh | 10 ++++++++++ 4 files changed, 16 insertions(+), 4 deletions(-)