Message ID | 92a71222e1067ca4ce9ecaaa555d78d0cce0d9d0.1714637671.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Clarify pseudo-ref terminology | expand |
On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote: > Before this patch series, root refs except for "HEAD" and our special > refs were classified as pseudorefs. Furthermore, our terminology > clarified that pseudorefs must not be symbolic refs. This restriction > is enforced in `is_root_ref()`, which explicitly checks that a supposed > root ref resolves to an object ID without recursing. > > This has been extremely confusing right from the start because (in old > terminology) a ref name may sometimes be a pseudoref and sometimes not > depending on whether it is a symbolic or regular ref. This behaviour > does not seem reasonable at all and I very much doubt that it results in > anything sane. > > Furthermore, the behaviour is different to `is_headref()`, which only > checks for the ref to exist. While that is in line with our glossary, > this inconsistency only adds to the confusion. > > Last but not least, the current behaviour can actually lead to a > segfault when calling `is_root_ref()` with a reference that either does > not exist or that is a symbolic ref because we never initialized `oid`. > > Let's loosen the restrictions in accordance to the new definition of > root refs, which are simply plain refs that may as well be a symbolic > ref. Consequently, we can just check for the ref to exist instead of > requiring it to be a regular ref. It's not clear to me that this existence check is particularly useful. Something that fails read_raw_ref() will fail if: - the file does not exist at all. But then how did somebody find out about it at all to ask is_pseudoref()? - it does exist, but does not look like a ref. Is this important? If I do "echo foo >.git/CHERRY_PICK_HEAD", does it become not a root ref anymore? Or is it a root ref that is broken? I'd have thought the latter, and the syntax is what distinguishes it. Making the classification purely syntactic based on the name feels simpler to me to reason about. You'll never run into confusing cases where repo state changes how commands may behave. And arguably is_pseudoref_syntax() should be taking into account the "_HEAD" restriction and special names anyway. It is a bit weird that even if we tighten up the refname checking to use is_pseudoref_syntax(), you'd still be able to "git update-ref FOO" but then not see it as a root ref! -Peff
On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote: > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote: > > > Before this patch series, root refs except for "HEAD" and our special > > refs were classified as pseudorefs. Furthermore, our terminology > > clarified that pseudorefs must not be symbolic refs. This restriction > > is enforced in `is_root_ref()`, which explicitly checks that a supposed > > root ref resolves to an object ID without recursing. > > > > This has been extremely confusing right from the start because (in old > > terminology) a ref name may sometimes be a pseudoref and sometimes not > > depending on whether it is a symbolic or regular ref. This behaviour > > does not seem reasonable at all and I very much doubt that it results in > > anything sane. > > > > Furthermore, the behaviour is different to `is_headref()`, which only > > checks for the ref to exist. While that is in line with our glossary, > > this inconsistency only adds to the confusion. > > > > Last but not least, the current behaviour can actually lead to a > > segfault when calling `is_root_ref()` with a reference that either does > > not exist or that is a symbolic ref because we never initialized `oid`. > > > > Let's loosen the restrictions in accordance to the new definition of > > root refs, which are simply plain refs that may as well be a symbolic > > ref. Consequently, we can just check for the ref to exist instead of > > requiring it to be a regular ref. > > It's not clear to me that this existence check is particularly useful. > Something that fails read_raw_ref() will fail if: > > - the file does not exist at all. But then how did somebody find out > about it at all to ask is_pseudoref()? > > - it does exist, but does not look like a ref. Is this important? If I > do "echo foo >.git/CHERRY_PICK_HEAD", does it become not a root ref > anymore? Or is it a root ref that is broken? I'd have thought the > latter, and the syntax is what distinguishes it. > > Making the classification purely syntactic based on the name feels > simpler to me to reason about. You'll never run into confusing cases > where repo state changes how commands may behave. I certainly agree and have been complaining about that in the past, too. I didn't dare to change the semantics this far yet. Let's have a look at the callers: - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs. It's clear that the intent is to classify based on the ref name, only. - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to determine whether it should add a ref to the root directory. It feels fishy that this uses ref existence checks to do that. - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to filter root refs. Again, using existence checks is pointless here as the iterator has just surfaced the ref, so it does exist. - "refs.c:is_current_worktree_ref()" uses it. Fishy as well, as the call to `is_per_worktree_ref()` also only checks for the refname. So let's remove these existence checks altogether and make this a check that purely checks semantics. > And arguably is_pseudoref_syntax() should be taking into account the > "_HEAD" restriction and special names anyway. It is a bit weird that > even if we tighten up the refname checking to use is_pseudoref_syntax(), > you'd still be able to "git update-ref FOO" but then not see it as a > root ref! True, as well. I'm less comfortable with doing that change in this series though as it does impose a major restriction that did not exist previously. We probably want some escape hatches so that it would still be possible to modify those refs when really required, for example to delete such broken refs. I would thus like to defer this to a follow up patch series, if you don't mind. Patrick
On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote: > On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote: > > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote: [snip] > > And arguably is_pseudoref_syntax() should be taking into account the > > "_HEAD" restriction and special names anyway. It is a bit weird that > > even if we tighten up the refname checking to use is_pseudoref_syntax(), > > you'd still be able to "git update-ref FOO" but then not see it as a > > root ref! > > True, as well. I'm less comfortable with doing that change in this > series though as it does impose a major restriction that did not exist > previously. We probably want some escape hatches so that it would still > be possible to modify those refs when really required, for example to > delete such broken refs. > > I would thus like to defer this to a follow up patch series, if you > don't mind. Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed to `is_root_ref_syntax()`) at all anymore after this series lands because it can be neatly rolled into `is_root_ref()`. The only caller, `is_current_worktree_ref()`, should really call `is_roof_ref()` and not `is_root_ref_syntax()`. But again, I'll defer this to a follow-up patch series. Patrick
On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote: > > Making the classification purely syntactic based on the name feels > > simpler to me to reason about. You'll never run into confusing cases > > where repo state changes how commands may behave. > > I certainly agree and have been complaining about that in the past, too. > I didn't dare to change the semantics this far yet. Let's have a look at > the callers: > > - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs. > It's clear that the intent is to classify based on the ref name, > only. > > - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to > determine whether it should add a ref to the root directory. It > feels fishy that this uses ref existence checks to do that. > > - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to > filter root refs. Again, using existence checks is pointless here as > the iterator has just surfaced the ref, so it does exist. > > - "refs.c:is_current_worktree_ref()" uses it. Fishy as well, as the > call to `is_per_worktree_ref()` also only checks for the refname. > > So let's remove these existence checks altogether and make this a check > that purely checks semantics. Thanks for doing the digging on the callers. That matches my intuition / light analysis, which is good. ;) > > And arguably is_pseudoref_syntax() should be taking into account the > > "_HEAD" restriction and special names anyway. It is a bit weird that > > even if we tighten up the refname checking to use is_pseudoref_syntax(), > > you'd still be able to "git update-ref FOO" but then not see it as a > > root ref! > > True, as well. I'm less comfortable with doing that change in this > series though as it does impose a major restriction that did not exist > previously. We probably want some escape hatches so that it would still > be possible to modify those refs when really required, for example to > delete such broken refs. > > I would thus like to defer this to a follow up patch series, if you > don't mind. I don't mind deferring. I thought it might make the simplifications you're doing in this series easier to reason about. But TBH I haven't had the chance to look through your series very carefully yet (and I'm still a bit back-logged), so I'm happy to go with your judgement on how to split it up. -Peff
On Wed, May 15, 2024 at 06:39:52AM +0200, Patrick Steinhardt wrote: > On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote: > > On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote: > > > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote: > [snip] > > > And arguably is_pseudoref_syntax() should be taking into account the > > > "_HEAD" restriction and special names anyway. It is a bit weird that > > > even if we tighten up the refname checking to use is_pseudoref_syntax(), > > > you'd still be able to "git update-ref FOO" but then not see it as a > > > root ref! > > > > True, as well. I'm less comfortable with doing that change in this > > series though as it does impose a major restriction that did not exist > > previously. We probably want some escape hatches so that it would still > > be possible to modify those refs when really required, for example to > > delete such broken refs. > > > > I would thus like to defer this to a follow up patch series, if you > > don't mind. > > Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed > to `is_root_ref_syntax()`) at all anymore after this series lands > because it can be neatly rolled into `is_root_ref()`. The only caller, > `is_current_worktree_ref()`, should really call `is_roof_ref()` and not > `is_root_ref_syntax()`. Yeah, and I'd expect that the more-strict check_refname_format() that I proposed elsewhere would be in the same boat. The only reason I used the "_syntax()" variant is that it was obviously wrong to do existence checks there. Once those are gone, then naturally it should be able to rely on is_root_ref() itself. -Peff
On Wed, May 15, 2024 at 02:22:20AM -0400, Jeff King wrote: > On Wed, May 15, 2024 at 06:39:52AM +0200, Patrick Steinhardt wrote: > > > On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote: > > > On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote: > > > > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote: > > [snip] > > > > And arguably is_pseudoref_syntax() should be taking into account the > > > > "_HEAD" restriction and special names anyway. It is a bit weird that > > > > even if we tighten up the refname checking to use is_pseudoref_syntax(), > > > > you'd still be able to "git update-ref FOO" but then not see it as a > > > > root ref! > > > > > > True, as well. I'm less comfortable with doing that change in this > > > series though as it does impose a major restriction that did not exist > > > previously. We probably want some escape hatches so that it would still > > > be possible to modify those refs when really required, for example to > > > delete such broken refs. > > > > > > I would thus like to defer this to a follow up patch series, if you > > > don't mind. > > > > Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed > > to `is_root_ref_syntax()`) at all anymore after this series lands > > because it can be neatly rolled into `is_root_ref()`. The only caller, > > `is_current_worktree_ref()`, should really call `is_roof_ref()` and not > > `is_root_ref_syntax()`. > > Yeah, and I'd expect that the more-strict check_refname_format() that I > proposed elsewhere would be in the same boat. The only reason I used the > "_syntax()" variant is that it was obviously wrong to do existence > checks there. Once those are gone, then naturally it should be able to > rely on is_root_ref() itself. This series hasn't been queued/merged yet, right? Do you plan to reroll it? I think that the changes in there are a good complementary addition to the clarifications in my patch series. Patrick
On Wed, May 15, 2024 at 08:35:52AM +0200, Patrick Steinhardt wrote: > > Yeah, and I'd expect that the more-strict check_refname_format() that I > > proposed elsewhere would be in the same boat. The only reason I used the > > "_syntax()" variant is that it was obviously wrong to do existence > > checks there. Once those are gone, then naturally it should be able to > > rely on is_root_ref() itself. > > This series hasn't been queued/merged yet, right? Do you plan to reroll > it? I think that the changes in there are a good complementary addition > to the clarifications in my patch series. Correct, I don't think Junio picked it up. It needed a re-roll anyway, so I'd plan to do it on top of your patches, assuming they are on track to get merged (and it sounds like there are no real objections). -Peff
On Wed, May 15, 2024 at 02:49:12AM -0400, Jeff King wrote: > On Wed, May 15, 2024 at 08:35:52AM +0200, Patrick Steinhardt wrote: > > > > Yeah, and I'd expect that the more-strict check_refname_format() that I > > > proposed elsewhere would be in the same boat. The only reason I used the > > > "_syntax()" variant is that it was obviously wrong to do existence > > > checks there. Once those are gone, then naturally it should be able to > > > rely on is_root_ref() itself. > > > > This series hasn't been queued/merged yet, right? Do you plan to reroll > > it? I think that the changes in there are a good complementary addition > > to the clarifications in my patch series. > > Correct, I don't think Junio picked it up. It needed a re-roll anyway, > so I'd plan to do it on top of your patches, assuming they are on track > to get merged (and it sounds like there are no real objections). That feels sensible. The series needs another thorough review and an Ack by somebody before Junio wants to merge it, but until now I'm not aware of any objections, yeah. So it should hopefully land soonish. Patrick
diff --git a/refs.c b/refs.c index 5b89e83ad7..ca9844bc3e 100644 --- a/refs.c +++ b/refs.c @@ -869,7 +869,10 @@ int is_root_ref(struct ref_store *refs, const char *refname) "NOTES_MERGE_REF", "MERGE_AUTOSTASH", }; - struct object_id oid; + struct strbuf referent = STRBUF_INIT; + struct object_id oid = { 0 }; + int failure_errno, ret = 0; + unsigned int flags; size_t i; if (!is_root_ref_syntax(refname)) @@ -877,30 +880,49 @@ int is_root_ref(struct ref_store *refs, const char *refname) if (is_headref(refs, refname)) return 1; + /* + * Note that we cannot use `refs_ref_exists()` here because that also + * checks whether its target ref exists in case refname is a symbolic + * ref. + */ if (ends_with(refname, "_HEAD")) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); + ret = !refs_read_raw_ref(refs, refname, &oid, &referent, + &flags, &failure_errno); + goto done; } - for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) + for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) { if (!strcmp(refname, irregular_root_refs[i])) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); + ret = !refs_read_raw_ref(refs, refname, &oid, &referent, + &flags, &failure_errno); + goto done; } + } - return 0; +done: + strbuf_release(&referent); + return ret; } int is_headref(struct ref_store *refs, const char *refname) { - if (!strcmp(refname, "HEAD")) - return refs_ref_exists(refs, refname); + struct strbuf referent = STRBUF_INIT; + struct object_id oid = { 0 }; + int failure_errno, ret = 0; + unsigned int flags; - return 0; + /* + * Note that we cannot use `refs_ref_exists()` here because that also + * checks whether its target ref exists in case refname is a symbolic + * ref. + */ + if (!strcmp(refname, "HEAD")) { + ret = !refs_read_raw_ref(refs, refname, &oid, &referent, + &flags, &failure_errno); + } + + strbuf_release(&referent); + return ret; } static int is_current_worktree_ref(const char *ref) { diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index 948f1bb5f4..92ed8957c8 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' ' test_cmp expect actual ' +test_expect_success '--include-root-refs omits dangling symrefs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + git symbolic-ref DANGLING_HEAD refs/heads/missing && + cat >expect <<-EOF && + HEAD + $(git symbolic-ref HEAD) + refs/tags/initial + EOF + git for-each-ref --format="%(refname)" --include-root-refs >actual && + test_cmp expect actual + ) +' + test_expect_success 'filtering with --points-at' ' cat >expect <<-\EOF && refs/heads/main
Before this patch series, root refs except for "HEAD" and our special refs were classified as pseudorefs. Furthermore, our terminology clarified that pseudorefs must not be symbolic refs. This restriction is enforced in `is_root_ref()`, which explicitly checks that a supposed root ref resolves to an object ID without recursing. This has been extremely confusing right from the start because (in old terminology) a ref name may sometimes be a pseudoref and sometimes not depending on whether it is a symbolic or regular ref. This behaviour does not seem reasonable at all and I very much doubt that it results in anything sane. Furthermore, the behaviour is different to `is_headref()`, which only checks for the ref to exist. While that is in line with our glossary, this inconsistency only adds to the confusion. Last but not least, the current behaviour can actually lead to a segfault when calling `is_root_ref()` with a reference that either does not exist or that is a symbolic ref because we never initialized `oid`. Let's loosen the restrictions in accordance to the new definition of root refs, which are simply plain refs that may as well be a symbolic ref. Consequently, we can just check for the ref to exist instead of requiring it to be a regular ref. Add a test that verifies that this does not change user-visible behaviour. Namely, we still don't want to show broken refs to the user by default in git-for-each-ref(1). What this does allow though is for internal callers to surface dangling root refs when they pass in the `DO_FOR_EACH_INCLUDE_BROKEN` flag. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs.c | 50 ++++++++++++++++++++++++---------- t/t6302-for-each-ref-filter.sh | 17 ++++++++++++ 2 files changed, 53 insertions(+), 14 deletions(-)