Message ID | 88822afe950318c0312de5541a411942a163b139.1714398019.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Clarify pseudo-ref terminology | expand |
Hi Patrick On 29/04/2024 14:41, Patrick Steinhardt wrote: > The `is_pseudoref()` function has somewhat weird behaviour in that it > both checks whether a reference looks like a pseudoref, but also that > the reference actually resolves to an object ID. > > In case a reference does not resolve though we can run into a segfault > because we never initialize the local `struct object_id` variable. Thus, > when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the > variable will stay uninitialize. We then try to look up the hash algo s/uninitialize/uninitialized/ > via the uninitialized value when calling `is_null_oid()`, which causes > us to segfault. > > It is somewhat questionable in the first place that we declare a ref to > be a pseudorefe depending on whether it resolves to an object ID or not. If I remember rightly Karthik added that check to avoid the files backend calling a file with a name that matched the pseudoref syntax a pseudoref when it wasn't actually a pseudoref. > And to make things even worse, a symbolic ref is currently considered to > not be a pseudo ref either because of `RRESOLVE_REF_NO_RECURSE`, s/RR/R/ That was a deliberate choice to fit with the definition of pseudorefs excluding symbolic refs. > which > will cause us to not resolve them to an object ID. Last but not least, > it also is inconsistent with `is_headref()`, which only checks for the > reference to exist via `refs_ref_exists()`. > > Refactor the code to do the same. While that still feels somewhat fishy, > it at least fixes the segfault for now. Alternatively we could call oidclr() when refs_resolve_refs_unsafe() returns NULL Best Wishes Phillip > I have not been able to come up > with a reproducible test case that does not rely on other bugs and very > intricate state. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > refs.c | 17 ++++------------- > 1 file changed, 4 insertions(+), 13 deletions(-) > > diff --git a/refs.c b/refs.c > index 567c6fc6ff..b35485f150 100644 > --- a/refs.c > +++ b/refs.c > @@ -900,7 +900,6 @@ int is_pseudoref(struct ref_store *refs, const char *refname) > "NOTES_MERGE_REF", > "MERGE_AUTOSTASH", > }; > - struct object_id oid; > size_t i; > > if (!is_pseudoref_syntax(refname)) > @@ -908,20 +907,12 @@ int is_pseudoref(struct ref_store *refs, const char *refname) > if (is_special_ref(refname)) > return 0; > > - if (ends_with(refname, "_HEAD")) { > - refs_resolve_ref_unsafe(refs, refname, > - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, > - &oid, NULL); > - return !is_null_oid(&oid); > - } > + if (ends_with(refname, "_HEAD")) > + return refs_ref_exists(refs, refname); > > for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++) > - if (!strcmp(refname, irregular_pseudorefs[i])) { > - refs_resolve_ref_unsafe(refs, refname, > - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, > - &oid, NULL); > - return !is_null_oid(&oid); > - } > + if (!strcmp(refname, irregular_pseudorefs[i])) > + return refs_ref_exists(refs, refname); > > return 0; > }
> It is somewhat questionable in the first place that we declare a ref to > be a pseudorefe depending on whether it resolves to an object ID or not. s/pseudorefe/pseudoref [snip] Phillip Wood <phillip.wood123@gmail.com> writes: >> via the uninitialized value when calling `is_null_oid()`, which causes >> us to segfault. >> >> It is somewhat questionable in the first place that we declare a ref to >> be a pseudorefe depending on whether it resolves to an object ID or not. > > If I remember rightly Karthik added that check to avoid the files > backend calling a file with a name that matched the pseudoref syntax a > pseudoref when it wasn't actually a pseudoref. Not sure I follow. I think it was strictly done to ensure we don't consider symrefs as pseudorefs [1]. [1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/
Hi Karthik On 29/04/2024 19:57, Karthik Nayak wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: >>> via the uninitialized value when calling `is_null_oid()`, which causes >>> us to segfault. >>> >>> It is somewhat questionable in the first place that we declare a ref to >>> be a pseudorefe depending on whether it resolves to an object ID or not. >> >> If I remember rightly Karthik added that check to avoid the files >> backend calling a file with a name that matched the pseudoref syntax a >> pseudoref when it wasn't actually a pseudoref. > > Not sure I follow. I think it was strictly done to ensure we don't > consider symrefs as pseudorefs [1]. Junio suggested using refs_read_ref_unsafe() to ensure we don't consider symrefs as pseudorefs but your patch was already reading the ref to ensure it was not some random file whose name matches the pseudoref syntax. Best Wishes Phillip
Hey Phillip, Phillip Wood <phillip.wood123@gmail.com> writes: > Hi Karthik > > On 29/04/2024 19:57, Karthik Nayak wrote: >> Phillip Wood <phillip.wood123@gmail.com> writes: >>>> via the uninitialized value when calling `is_null_oid()`, which causes >>>> us to segfault. >>>> >>>> It is somewhat questionable in the first place that we declare a ref to >>>> be a pseudorefe depending on whether it resolves to an object ID or not. >>> >>> If I remember rightly Karthik added that check to avoid the files >>> backend calling a file with a name that matched the pseudoref syntax a >>> pseudoref when it wasn't actually a pseudoref. >> >> Not sure I follow. I think it was strictly done to ensure we don't >> consider symrefs as pseudorefs [1]. > > Junio suggested using refs_read_ref_unsafe() to ensure we don't consider > symrefs as pseudorefs but your patch was already reading the ref to > ensure it was not some random file whose name matches the pseudoref syntax. > > Best Wishes > > Phillip Oh yes. You're absolutely correct. I just didn't understand what you were referring to :)
On Mon, Apr 29, 2024 at 11:57:53AM -0700, Karthik Nayak wrote: > > It is somewhat questionable in the first place that we declare a ref to > > be a pseudorefe depending on whether it resolves to an object ID or not. > > s/pseudorefe/pseudoref > > [snip] > > Phillip Wood <phillip.wood123@gmail.com> writes: > >> via the uninitialized value when calling `is_null_oid()`, which causes > >> us to segfault. > >> > >> It is somewhat questionable in the first place that we declare a ref to > >> be a pseudorefe depending on whether it resolves to an object ID or not. > > > > If I remember rightly Karthik added that check to avoid the files > > backend calling a file with a name that matched the pseudoref syntax a > > pseudoref when it wasn't actually a pseudoref. > > Not sure I follow. I think it was strictly done to ensure we don't > consider symrefs as pseudorefs [1]. > > [1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/ And that's fair from a terminology perspective. But honestly, I really doubt that any user will understand that REBASE_HEAD is a pseudoref when it contains an object ID, but is not a pseudoref when it is a symref. Anyway, as I've said in parallel mails, I want to change the definition of what a pseudoref is. I just think that the current mess is understood by nobody and doesn't make any sense. I'll thus implicitly address this in my v2. Patrick
diff --git a/refs.c b/refs.c index 567c6fc6ff..b35485f150 100644 --- a/refs.c +++ b/refs.c @@ -900,7 +900,6 @@ int is_pseudoref(struct ref_store *refs, const char *refname) "NOTES_MERGE_REF", "MERGE_AUTOSTASH", }; - struct object_id oid; size_t i; if (!is_pseudoref_syntax(refname)) @@ -908,20 +907,12 @@ int is_pseudoref(struct ref_store *refs, const char *refname) if (is_special_ref(refname)) return 0; - if (ends_with(refname, "_HEAD")) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); - } + if (ends_with(refname, "_HEAD")) + return refs_ref_exists(refs, refname); for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++) - if (!strcmp(refname, irregular_pseudorefs[i])) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); - } + if (!strcmp(refname, irregular_pseudorefs[i])) + return refs_ref_exists(refs, refname); return 0; }
The `is_pseudoref()` function has somewhat weird behaviour in that it both checks whether a reference looks like a pseudoref, but also that the reference actually resolves to an object ID. In case a reference does not resolve though we can run into a segfault because we never initialize the local `struct object_id` variable. Thus, when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the variable will stay uninitialize. We then try to look up the hash algo via the uninitialized value when calling `is_null_oid()`, which causes us to segfault. It is somewhat questionable in the first place that we declare a ref to be a pseudorefe depending on whether it resolves to an object ID or not. And to make things even worse, a symbolic ref is currently considered to not be a pseudo ref either because of `RRESOLVE_REF_NO_RECURSE`, which will cause us to not resolve them to an object ID. Last but not least, it also is inconsistent with `is_headref()`, which only checks for the reference to exist via `refs_ref_exists()`. Refactor the code to do the same. While that still feels somewhat fishy, it at least fixes the segfault for now. I have not been able to come up with a reproducible test case that does not rely on other bugs and very intricate state. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)