diff mbox series

[3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved

Message ID 88822afe950318c0312de5541a411942a163b139.1714398019.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series Clarify pseudo-ref terminology | expand

Commit Message

Patrick Steinhardt April 29, 2024, 1:41 p.m. UTC
The `is_pseudoref()` function has somewhat weird behaviour in that it
both checks whether a reference looks like a pseudoref, but also that
the reference actually resolves to an object ID.

In case a reference does not resolve though we can run into a segfault
because we never initialize the local `struct object_id` variable. Thus,
when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the
variable will stay uninitialize. We then try to look up the hash algo
via the uninitialized value when calling `is_null_oid()`, which causes
us to segfault.

It is somewhat questionable in the first place that we declare a ref to
be a pseudorefe depending on whether it resolves to an object ID or not.
And to make things even worse, a symbolic ref is currently considered to
not be a pseudo ref either because of `RRESOLVE_REF_NO_RECURSE`, which
will cause us to not resolve them to an object ID. Last but not least,
it also is inconsistent with `is_headref()`, which only checks for the
reference to exist via `refs_ref_exists()`.

Refactor the code to do the same. While that still feels somewhat fishy,
it at least fixes the segfault for now. I have not been able to come up
with a reproducible test case that does not rely on other bugs and very
intricate state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

Comments

Phillip Wood April 29, 2024, 3:25 p.m. UTC | #1
Hi Patrick

On 29/04/2024 14:41, Patrick Steinhardt wrote:
> The `is_pseudoref()` function has somewhat weird behaviour in that it
> both checks whether a reference looks like a pseudoref, but also that
> the reference actually resolves to an object ID.
> 
> In case a reference does not resolve though we can run into a segfault
> because we never initialize the local `struct object_id` variable. Thus,
> when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the
> variable will stay uninitialize. We then try to look up the hash algo

s/uninitialize/uninitialized/

> via the uninitialized value when calling `is_null_oid()`, which causes
> us to segfault.
> 
> It is somewhat questionable in the first place that we declare a ref to
> be a pseudorefe depending on whether it resolves to an object ID or not.

If I remember rightly Karthik added that check to avoid the files 
backend calling a file with a name that matched the pseudoref syntax a 
pseudoref when it wasn't actually a pseudoref.

> And to make things even worse, a symbolic ref is currently considered to
> not be a pseudo ref either because of `RRESOLVE_REF_NO_RECURSE`,

s/RR/R/

That was a deliberate choice to fit with the definition of pseudorefs 
excluding symbolic refs.

> which
> will cause us to not resolve them to an object ID. Last but not least,
> it also is inconsistent with `is_headref()`, which only checks for the
> reference to exist via `refs_ref_exists()`.
> 
> Refactor the code to do the same. While that still feels somewhat fishy,
> it at least fixes the segfault for now.

Alternatively we could call oidclr() when refs_resolve_refs_unsafe() 
returns NULL

Best Wishes

Phillip

> I have not been able to come up
> with a reproducible test case that does not rely on other bugs and very
> intricate state.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>   refs.c | 17 ++++-------------
>   1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 567c6fc6ff..b35485f150 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -900,7 +900,6 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>   		"NOTES_MERGE_REF",
>   		"MERGE_AUTOSTASH",
>   	};
> -	struct object_id oid;
>   	size_t i;
>   
>   	if (!is_pseudoref_syntax(refname))
> @@ -908,20 +907,12 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>   	if (is_special_ref(refname))
>   		return 0;
>   
> -	if (ends_with(refname, "_HEAD")) {
> -		refs_resolve_ref_unsafe(refs, refname,
> -					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
> -					&oid, NULL);
> -		return !is_null_oid(&oid);
> -	}
> +	if (ends_with(refname, "_HEAD"))
> +		return refs_ref_exists(refs, refname);
>   
>   	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
> -		if (!strcmp(refname, irregular_pseudorefs[i])) {
> -			refs_resolve_ref_unsafe(refs, refname,
> -						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
> -						&oid, NULL);
> -			return !is_null_oid(&oid);
> -		}
> +		if (!strcmp(refname, irregular_pseudorefs[i]))
> +			return refs_ref_exists(refs, refname);
>   
>   	return 0;
>   }
karthik nayak April 29, 2024, 6:57 p.m. UTC | #2
> It is somewhat questionable in the first place that we declare a ref to
> be a pseudorefe depending on whether it resolves to an object ID or not.

s/pseudorefe/pseudoref

[snip]

Phillip Wood <phillip.wood123@gmail.com> writes:
>> via the uninitialized value when calling `is_null_oid()`, which causes
>> us to segfault.
>>
>> It is somewhat questionable in the first place that we declare a ref to
>> be a pseudorefe depending on whether it resolves to an object ID or not.
>
> If I remember rightly Karthik added that check to avoid the files
> backend calling a file with a name that matched the pseudoref syntax a
> pseudoref when it wasn't actually a pseudoref.

Not sure I follow. I think it was strictly done to ensure we don't
consider symrefs as pseudorefs [1].

[1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/
Phillip Wood April 29, 2024, 7:47 p.m. UTC | #3
Hi Karthik

On 29/04/2024 19:57, Karthik Nayak wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>>> via the uninitialized value when calling `is_null_oid()`, which causes
>>> us to segfault.
>>>
>>> It is somewhat questionable in the first place that we declare a ref to
>>> be a pseudorefe depending on whether it resolves to an object ID or not.
>>
>> If I remember rightly Karthik added that check to avoid the files
>> backend calling a file with a name that matched the pseudoref syntax a
>> pseudoref when it wasn't actually a pseudoref.
> 
> Not sure I follow. I think it was strictly done to ensure we don't
> consider symrefs as pseudorefs [1].

Junio suggested using refs_read_ref_unsafe() to ensure we don't consider 
symrefs as pseudorefs but your patch was already reading the ref to 
ensure it was not some random file whose name matches the pseudoref syntax.

Best Wishes

Phillip
karthik nayak April 29, 2024, 8:44 p.m. UTC | #4
Hey Phillip,

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Karthik
>
> On 29/04/2024 19:57, Karthik Nayak wrote:
>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>>> via the uninitialized value when calling `is_null_oid()`, which causes
>>>> us to segfault.
>>>>
>>>> It is somewhat questionable in the first place that we declare a ref to
>>>> be a pseudorefe depending on whether it resolves to an object ID or not.
>>>
>>> If I remember rightly Karthik added that check to avoid the files
>>> backend calling a file with a name that matched the pseudoref syntax a
>>> pseudoref when it wasn't actually a pseudoref.
>>
>> Not sure I follow. I think it was strictly done to ensure we don't
>> consider symrefs as pseudorefs [1].
>
> Junio suggested using refs_read_ref_unsafe() to ensure we don't consider
> symrefs as pseudorefs but your patch was already reading the ref to
> ensure it was not some random file whose name matches the pseudoref syntax.
>
> Best Wishes
>
> Phillip

Oh yes. You're absolutely correct. I just didn't understand what you
were referring to :)
Patrick Steinhardt April 30, 2024, 7:30 a.m. UTC | #5
On Mon, Apr 29, 2024 at 11:57:53AM -0700, Karthik Nayak wrote:
> > It is somewhat questionable in the first place that we declare a ref to
> > be a pseudorefe depending on whether it resolves to an object ID or not.
> 
> s/pseudorefe/pseudoref
> 
> [snip]
> 
> Phillip Wood <phillip.wood123@gmail.com> writes:
> >> via the uninitialized value when calling `is_null_oid()`, which causes
> >> us to segfault.
> >>
> >> It is somewhat questionable in the first place that we declare a ref to
> >> be a pseudorefe depending on whether it resolves to an object ID or not.
> >
> > If I remember rightly Karthik added that check to avoid the files
> > backend calling a file with a name that matched the pseudoref syntax a
> > pseudoref when it wasn't actually a pseudoref.
> 
> Not sure I follow. I think it was strictly done to ensure we don't
> consider symrefs as pseudorefs [1].
> 
> [1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/

And that's fair from a terminology perspective. But honestly, I really
doubt that any user will understand that REBASE_HEAD is a pseudoref when
it contains an object ID, but is not a pseudoref when it is a symref.

Anyway, as I've said in parallel mails, I want to change the definition
of what a pseudoref is. I just think that the current mess is understood
by nobody and doesn't make any sense.

I'll thus implicitly address this in my v2.

Patrick
diff mbox series

Patch

diff --git a/refs.c b/refs.c
index 567c6fc6ff..b35485f150 100644
--- a/refs.c
+++ b/refs.c
@@ -900,7 +900,6 @@  int is_pseudoref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
 	size_t i;
 
 	if (!is_pseudoref_syntax(refname))
@@ -908,20 +907,12 @@  int is_pseudoref(struct ref_store *refs, const char *refname)
 	if (is_special_ref(refname))
 		return 0;
 
-	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
-	}
+	if (ends_with(refname, "_HEAD"))
+		return refs_ref_exists(refs, refname);
 
 	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
-		}
+		if (!strcmp(refname, irregular_pseudorefs[i]))
+			return refs_ref_exists(refs, refname);
 
 	return 0;
 }