Message ID | 20181218072528.3870492-3-martin.agren@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | setup: add `clear_repository_format()` |
On Tue, Dec 18, 2018 at 08:25:27AM +0100, Martin Ågren wrote:
> I fully admit to not understanding all of this setup code, neither in
> its current incarnation, nor in terms of an ideal end game. This check
> seems like a good thing to do though.

It's definitely complex.

> diff --git a/setup.c b/setup.c
> index 27747af7a3..52c3c9d31f 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1138,7 +1138,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
>  			setup_git_env(gitdir);
>  		}
> -		if (startup_info->have_repository)
> +		if (startup_info->have_repository && repo_fmt.version > -1)
>  			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
>  	}

I think this change is fine, because we initialize the value in
the_repository elsewhere, and if there's no repository, this should
never have a value other than the default anyway.

I looked at the other patches in the series and thought they looked sane
as well.
On Tue, Dec 18, 2018 at 08:25:27AM +0100, Martin Ågren wrote:
> If `read_repository_format()` encounters an error, `format->version`
> will be -1 and all other fields of `format` will be undefined. However,
> in `setup_git_directory_gently()`, we use `repo_fmt.hash_algo`
> regardless of the value of `repo_fmt.version`.
>
> This can be observed by adding this to the end of
> `read_repository_format()`:
>
> 	if (format->version == -1)
> 		format->hash_algo = 0; /* no-one should peek at this! */
>
> This causes, e.g., "git branch -m q q2 without config should succeed" in
> t3200 to fail with "fatal: Failed to resolve HEAD as a valid ref."
> because it has moved .git/config out of the way and is now trying to use
> a bad hash algorithm.
>
> Check that `version` is non-negative before using `hash_algo`.
>
> This patch adds no tests, but do note that if we skip this patch, the
> next patch would cause existing tests to fail as outlined above.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>

Hmm. It looks like we never set repo_fmt.hash_algo to anything besides
GIT_HASH_SHA1 anyway. I guess the existing field is really just there in
preparation for us eventually respecting extensions.hashAlgorithm (or
whatever it's called).

Given what I said in my previous email about repos with a missing
"version" field, I wondered if this patch would be breaking config like:

  [core]
  # no repositoryformatversion!
  [extensions]
  hashAlgorithm = sha256

But I'd argue that:

  1. That's pretty dumb config that we shouldn't need to support. Even
     if we care about handling the missing version for historical repos,
     they wouldn't be talking sha256.

  2. Arguably we should not even look at extensions.* unless we see a
     version >= 1. But we do process them as we parse the config file.
     This is mostly an oversight, I think. We have to handle them as we
     see them, because they may come out of order with respect to the
     repositoryformatversion field. But we could put them into a
     string_list, and then only process them after we've decided which
     version we have.

So I think your patch is doing the right thing, and won't hurt any real
cases. But (of course) there are more opportunities to clean things up.

-Peff
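The "queue the extensions, decide after the version is known" idea in point 2 can be sketched in isolation. This is only an illustration under stated assumptions, not Git's code: `struct sketch_format`, `sketch_init()`, `sketch_handle_entry()` and `sketch_finish()` are hypothetical names, a small fixed array stands in for a `string_list`, and `extensions.hashAlgorithm` is the placeholder key used in the message above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_PENDING 8

/* Hypothetical stand-in for the parsed repository format. */
struct sketch_format {
	int version;      /* -1 until repositoryformatversion is seen */
	int use_sha256;   /* set only by an extension that is honored */
	int nr_pending;
	/* Borrowed pointers: the caller must keep the strings alive. */
	const char *pending_key[SKETCH_MAX_PENDING];
	const char *pending_val[SKETCH_MAX_PENDING];
};

static void sketch_init(struct sketch_format *f)
{
	assert(f != NULL);
	memset(f, 0, sizeof(*f));
	f->version = -1;
}

/* Called once per config entry, in whatever order the file has them. */
static void sketch_handle_entry(struct sketch_format *f,
				const char *key, const char *val)
{
	if (!strcmp(key, "core.repositoryformatversion")) {
		f->version = atoi(val);
	} else if (!strncmp(key, "extensions.", strlen("extensions.")) &&
		   f->nr_pending < SKETCH_MAX_PENDING) {
		/* Do not act yet; just remember the extension. */
		f->pending_key[f->nr_pending] = key;
		f->pending_val[f->nr_pending] = val;
		f->nr_pending++;
	}
}

/*
 * Called after parsing. Queued extensions are applied only when
 * version >= 1. Returns the number of unknown extensions seen, or 0
 * when the extensions were ignored because of the version.
 */
static int sketch_finish(struct sketch_format *f)
{
	int i, unknown = 0;

	if (f->version < 1)
		return 0;
	for (i = 0; i < f->nr_pending; i++) {
		if (!strcmp(f->pending_key[i], "extensions.hashAlgorithm"))
			f->use_sha256 = !strcmp(f->pending_val[i], "sha256");
		else
			unknown++;
	}
	return unknown;
}
```

Because nothing is applied until `sketch_finish()`, the result does not depend on whether the version line comes before or after the extension lines.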
On Wed, 19 Dec 2018 at 01:18, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> I think this change is fine, because we initialize the value in
> the_repository elsewhere, and if there's no repository, this should
> never have a value other than the default anyway.

Thanks, it feels good that this patch matches how you think about the
`hash_algo` field.

> I looked at the other patches in the series and thought they looked sane
> as well.

Thanks for a review, I appreciate it.

Martin
On Wed, 19 Dec 2018 at 16:38, Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 18, 2018 at 08:25:27AM +0100, Martin Ågren wrote:
>
> > Check that `version` is non-negative before using `hash_algo`.

> Hmm. It looks like we never set repo_fmt.hash_algo to anything besides
> GIT_HASH_SHA1 anyway. I guess the existing field is really just there in
> preparation for us eventually respecting extensions.hashAlgorithm (or
> whatever it's called).

That was my understanding as well. Maybe I should have spelled it out.
I think of the diff of this patch as "let's check `foo->valid` before we
`use(foo->bar)`", which should only be able to regress in case foo isn't
valid. And ...

> Given what I said in my previous email about repos with a missing
> "version" field, I wondered if this patch would be breaking config like:
>
>   [core]
>   # no repositoryformatversion!
>   [extensions]
>   hashAlgorithm = sha256
>
> But I'd argue that:
>
>   1. That's pretty dumb config that we shouldn't need to support. Even
>      if we care about handling the missing version for historical repos,
>      they wouldn't be talking sha256.

... this matches my thinking.

>   2. Arguably we should not even look at extensions.* unless we see a
>      version >= 1. But we do process them as we parse the config file.
>      This is mostly an oversight, I think. We have to handle them as we
>      see them, because they may come out of order with respect to the
>      repositoryformatversion field. But we could put them into a
>      string_list, and then only process them after we've decided which
>      version we have.

I hadn't thought too much about this. I guess that for some simpler
extensions--versions dependencies it would be feasible to first parse
everything, then, depending on the version we've identified, forget
about any "irrelevant" extensions. Again, nothing I've thought much
about, and seems to be safely out of scope for this patch.

> So I think your patch is doing the right thing, and won't hurt any real
> cases. But (of course) there are more opportunities to clean things up.
On Wed, Dec 19, 2018 at 10:46:52PM +0100, Martin Ågren wrote:
> >   2. Arguably we should not even look at extensions.* unless we see a
> >      version >= 1. But we do process them as we parse the config file.
> >      This is mostly an oversight, I think. We have to handle them as we
> >      see them, because they may come out of order with respect to the
> >      repositoryformatversion field. But we could put them into a
> >      string_list, and then only process them after we've decided which
> >      version we have.
>
> I hadn't thought too much about this. I guess that for some simpler
> extensions--versions dependencies it would be feasible to first parse
> everything, then, depending on the version we've identified, forget
> about any "irrelevant" extensions. Again, nothing I've thought much
> about, and seems to be safely out of scope for this patch.

The decision is actually pretty straight-forward: if version < 1, ignore
extensions, otherwise respect them (and complain about any we don't know
about).

So I think we could just do in verify_repository_format() something
like:

  if (version < 1) {
	/* "undo" any extensions we might have parsed */
	data->precious_objects = 0;
	FREE_AND_NULL(data->partial_clone);
	data->worktree_config = 0;
	data->hash_algo = GIT_HASH_SHA1;
  } else {
	/* complain about unknown extension; we already do this! */
  }

It's a little ugly to have to know about all the extensions here, but we
already initialize them in read_repository_format(). We could probably
factor that out into a shared function.

-Peff
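The "shared function" mentioned at the end of that message could look roughly like the following. This is a hedged sketch rather than a proposed patch: `struct sketch_repo_format`, `reset_extensions()`, `init_format()`, `drop_extensions_if_unversioned()` and the `SKETCH_HASH_*` constants are all hypothetical stand-ins, and plain `free()` plus an assignment to `NULL` stands in for `FREE_AND_NULL()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical constants; not Git's GIT_HASH_* values. */
enum { SKETCH_HASH_SHA1 = 1, SKETCH_HASH_SHA256 = 2 };

/* Hypothetical stand-in holding the fields named in the message above. */
struct sketch_repo_format {
	int version;
	int precious_objects;
	char *partial_clone;   /* heap-allocated, or NULL */
	int worktree_config;
	int hash_algo;
};

/* The single place that knows every extension's default value. */
static void reset_extensions(struct sketch_repo_format *data)
{
	assert(data != NULL);
	data->precious_objects = 0;
	free(data->partial_clone);     /* free(NULL) is a no-op */
	data->partial_clone = NULL;
	data->worktree_config = 0;
	data->hash_algo = SKETCH_HASH_SHA1;
}

/* Initialization path: start from the defaults before parsing. */
static void init_format(struct sketch_repo_format *data)
{
	assert(data != NULL);
	data->version = -1;
	data->partial_clone = NULL;    /* make the free() in reset safe */
	reset_extensions(data);
}

/* Verification path: "undo" parsed extensions when version < 1. */
static void drop_extensions_if_unversioned(struct sketch_repo_format *data)
{
	if (data->version < 1)
		reset_extensions(data);
}
```

With this shape, adding a new extension field means touching `reset_extensions()` only, and both the initialization and the "undo" path pick up the change.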
On Wed, Dec 19, 2018 at 10:38:41AM -0500, Jeff King wrote:
> Hmm. It looks like we never set repo_fmt.hash_algo to anything besides
> GIT_HASH_SHA1 anyway. I guess the existing field is really just there in
> preparation for us eventually respecting extensions.hashAlgorithm (or
> whatever it's called).

Yeah, it is. I haven't tested, but since we just read the value of
extensions.objectFormat, this patch shouldn't have any effect on the
SHA-256 code. The default remains SHA-1 if a value isn't specified
somehow.
diff --git a/setup.c b/setup.c
index 27747af7a3..52c3c9d31f 100644
--- a/setup.c
+++ b/setup.c
@@ -1138,7 +1138,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository && repo_fmt.version > -1)
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
 	}
If `read_repository_format()` encounters an error, `format->version`
will be -1 and all other fields of `format` will be undefined. However,
in `setup_git_directory_gently()`, we use `repo_fmt.hash_algo`
regardless of the value of `repo_fmt.version`.

This can be observed by adding this to the end of
`read_repository_format()`:

	if (format->version == -1)
		format->hash_algo = 0; /* no-one should peek at this! */

This causes, e.g., "git branch -m q q2 without config should succeed" in
t3200 to fail with "fatal: Failed to resolve HEAD as a valid ref."
because it has moved .git/config out of the way and is now trying to use
a bad hash algorithm.

Check that `version` is non-negative before using `hash_algo`.

This patch adds no tests, but do note that if we skip this patch, the
next patch would cause existing tests to fail as outlined above.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 I fully admit to not understanding all of this setup code, neither in
 its current incarnation, nor in terms of an ideal end game. This check
 seems like a good thing to do though.

 setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
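The guard this patch adds can be seen in miniature with mock types. The sketch below is an illustration only and does not use Git's real structures: `struct mock_repo_format`, `struct mock_repo`, `apply_hash_algo()` and the `MOCK_HASH_*` constants are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants; 0 plays the role of the "poisoned" value. */
enum { MOCK_HASH_UNSET = 0, MOCK_HASH_SHA1 = 1 };

struct mock_repo_format {
	int version;    /* -1 means reading the format failed */
	int hash_algo;  /* only meaningful when version > -1 */
};

struct mock_repo {
	int hash_algo;
};

/*
 * Copy hash_algo into the repository only when a repository exists
 * and the format was read successfully. Returns 1 if the value was
 * applied and 0 if it was skipped.
 */
static int apply_hash_algo(struct mock_repo *repo,
			   const struct mock_repo_format *fmt,
			   int have_repository)
{
	assert(repo != NULL && fmt != NULL);
	if (have_repository && fmt->version > -1) {
		repo->hash_algo = fmt->hash_algo;
		return 1;
	}
	return 0;
}
```

When `version` is -1, the repository keeps whatever value it was initialized with elsewhere, so a poisoned `hash_algo` in the format struct is never observed.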