[v2,01/10] Documentation/glossary: redefine pseudorefs as special refs

Message ID 2489bb558543d66592fb0f3eb4d4696ba8e31fea.1714479928.git.ps@pks.im (mailing list archive)
State Superseded
Series Clarify pseudo-ref terminology

Commit Message

Patrick Steinhardt April 30, 2024, 12:26 p.m. UTC
Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be a file that start with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadads, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that hihlights where a ref has been
fetched from or the list of commits that have been merged.

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and return the pseudoref terminology back to its
original intent: a ref that sometimes behave like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 42 +++++++++---------------------
 1 file changed, 13 insertions(+), 29 deletions(-)
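
As a rough illustration of the metadata the commit message describes, the following Python sketch parses the text of a `FETCH_HEAD` file. It assumes the tab-separated layout that git-fetch(1) writes in current versions: an object ID, an optional `not-for-merge` marker, and a free-form description of where the object was fetched from. That layout is not spelled out in this patch, and the names `FetchHeadEntry` and `parse_fetch_head` are made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FetchHeadEntry:
    """One line of FETCH_HEAD (hypothetical helper type, not part of Git)."""
    object_id: str
    for_merge: bool
    description: str


def parse_fetch_head(text: str) -> List[FetchHeadEntry]:
    """Parse FETCH_HEAD contents.

    Assumed line layout: "<oid>\t<marker>\t<description>", where <marker>
    is either empty or the literal string "not-for-merge".
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.split("\t", 2)
        if len(parts) != 3:
            raise ValueError(f"unexpected FETCH_HEAD line: {line!r}")
        object_id, marker, description = parts
        entries.append(
            FetchHeadEntry(object_id, marker != "not-for-merge", description)
        )
    return entries


if __name__ == "__main__":
    # Made-up sample data; real object IDs and URLs will differ.
    sample = (
        "a" * 40 + "\t\tbranch 'main' of https://example.com/repo\n"
        + "b" * 40 + "\tnot-for-merge\tbranch 'topic' of https://example.com/repo\n"
    )
    for entry in parse_fetch_head(sample):
        print(entry)
```

Because a single file can carry several object IDs plus per-entry annotations, it cannot be represented as an ordinary ref that resolves to exactly one object, which is the property the commit message uses to set these files apart.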

Comments

karthik nayak April 30, 2024, 12:49 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Nowadays, Git knows about three different kinds of refs. As defined in
> gitglossary(7):
>
>   - Regular refs that start with "refs/", like "refs/heads/main".
>
>   - Pseudorefs, which live in the root directory. These must have
>     all-caps names and must be a file that start with an object hash.
>     Consequently, symbolic refs are not pseudorefs because they do not
>     start with an object hash.
>
>   - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".
>

Nit: but since you go into explaining what the _old_ pseudoref is,
perhaps you should also add a line about why "FETCH_HEAD" and
"MERGE_HEAD" were called special refs.

> This state is extremely confusing, and I would claim that most folks
> don't fully understand what is what here. The current definitions also
> have several problems:
>
>   - Where does "HEAD" fit in? It's not a pseudoref because it can be
>     a symbolic ref. It's not a regular ref because it does not start
>     with "refs/". And it's not a special ref, either.
>
>   - There is a strong overlap between pseudorefs and special refs. The
>     pseudoref section for example mentions "MERGE_HEAD", even though it
>     is a special ref. Is it thus both a pseudoref and a special ref?
>
>   - Why do we even need to distinguish refs that live in the root from
>     other refs when they behave just like a regular ref anyway?
>
> In other words, the current state is quite a mess and leads to wild
> inconsistencies without much of a good reason.
>
> The original reason why pseudorefs were introduced is that there are
> some refs that sometimes behave like a ref, even though they aren't a
> ref. And we really only have two of these nowadads, namely "MERGE_HEAD"
> and "FETCH_HEAD". Those files are never written via the ref backends,
> but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
> They contain additional metadata that hihlights where a ref has been

s/hihlights/highlights

> fetched from or the list of commits that have been merged.

This is good detail and I guess you can skip my earlier suggestion.

> This original intent in fact matches the definition of special refs that
> we have recently introduced in 8df4c5d205 (Documentation: add "special
> refs" to the glossary, 2024-01-19). Due to the introduction of the new
> reftable backend we were forced to distinguish those refs more clearly
> such that we don't ever try to read or write them via the reftable
> backend. In the same series, we also addressed all the other cases where
> we used to write those special refs via the filesystem directly, thus
> circumventing the ref backend, to instead write them via the backends.
> Consequently, there are no other refs left anymore which are special.
>
> Let's address this mess and return the pseudoref terminology back to its
> original intent: a ref that sometimes behave like a ref, but which isn't
> really a ref because it gets written to the filesystem directly. Or in
> other words, let's redefine pseudorefs to match the current definition
> of special refs. As special refs and pseudorefs are now the same per
> definition, we can drop the "special refs" term again. It's not exposed
> to our users and thus they wouldn't ever encounter that term anyway.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/glossary-content.txt | 42 +++++++++---------------------
>  1 file changed, 13 insertions(+), 29 deletions(-)
>
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index d71b199955..f5c0f49150 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -496,21 +496,19 @@ exclude;;
>  	that start with `refs/bisect/`, but might later include other
>  	unusual refs.
>
> -[[def_pseudoref]]pseudoref::
> -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> -	like refs for the purposes of rev-parse, but which are treated
> -	specially by git.  Pseudorefs both have names that are all-caps,
> -	and always start with a line consisting of a
> -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> -	pseudoref, because it is sometimes a symbolic ref.  They might
> -	optionally contain some additional data.  `MERGE_HEAD` and
> -	`CHERRY_PICK_HEAD` are examples.  Unlike
> -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> -	be symbolic refs, and never have reflogs.  They also cannot be
> -	updated through the normal ref update machinery.  Instead,
> -	they are updated by directly writing to the files.  However,
> -	they can be read as if they were refs, so `git rev-parse
> -	MERGE_HEAD` will work.
> +[[def_pseudoref]]pseudoref ref::

shouldn't this just be 'pseudoref'?

> +	A ref that has different semantics than normal refs. These refs can be
> +	accessed via normal Git commands but may not behave the same as a
> +	normal ref in some cases.
> ++
> +The following pseudorefs are known to Git:
> +
> + - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
> +   may refer to multiple object IDs. Each object ID is annotated with metadata
> +   indicating where it was fetched from and its fetch status.
> +
> + - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
> +   conflicts. It contains all commit IDs which are being merged.
>
>  [[def_pull]]pull::
>  	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
> @@ -638,20 +636,6 @@ The most notable example is `HEAD`.
>  	An <<def_object,object>> used to temporarily store the contents of a
>  	<<def_dirty,dirty>> working directory and the index for future reuse.
>
> -[[def_special_ref]]special ref::
> -	A ref that has different semantics than normal refs. These refs can be
> -	accessed via normal Git commands but may not behave the same as a
> -	normal ref in some cases.
> -+
> -The following special refs are known to Git:
> -
> - - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
> -   may refer to multiple object IDs. Each object ID is annotated with metadata
> -   indicating where it was fetched from and its fetch status.
> -
> - - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
> -   conflicts. It contains all commit IDs which are being merged.
> -
>  [[def_submodule]]submodule::
>  	A <<def_repository,repository>> that holds the history of a
>  	separate project inside another repository (the latter of
> --
> 2.45.0

Justin Tobler April 30, 2024, 5:17 p.m. UTC | #2
On 24/04/30 02:26PM, Patrick Steinhardt wrote:
> Nowadays, Git knows about three different kinds of refs. As defined in
> gitglossary(7):
> 
>   - Regular refs that start with "refs/", like "refs/heads/main".
> 
>   - Pseudorefs, which live in the root directory. These must have
>     all-caps names and must be a file that start with an object hash.
>     Consequently, symbolic refs are not pseudorefs because they do not
>     start with an object hash.
> 
>   - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".
> 
> This state is extremely confusing, and I would claim that most folks
> don't fully understand what is what here. The current definitions also
> have several problems:
> 
>   - Where does "HEAD" fit in? It's not a pseudoref because it can be
>     a symbolic ref. It's not a regular ref because it does not start
>     with "refs/". And it's not a special ref, either.
> 
>   - There is a strong overlap between pseudorefs and special refs. The
>     pseudoref section for example mentions "MERGE_HEAD", even though it
>     is a special ref. Is it thus both a pseudoref and a special ref?
> 
>   - Why do we even need to distinguish refs that live in the root from
>     other refs when they behave just like a regular ref anyway?
> 
> In other words, the current state is quite a mess and leads to wild
> inconsistencies without much of a good reason.
> 
> The original reason why pseudorefs were introduced is that there are
> some refs that sometimes behave like a ref, even though they aren't a
> ref. And we really only have two of these nowadads, namely "MERGE_HEAD"

s/nowadads/nowadays/

-Justin

Junio C Hamano April 30, 2024, 8:12 p.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> Let's address this mess and return the pseudoref terminology back to its
> original intent: a ref that sometimes behave like a ref, but which isn't
> really a ref because it gets written to the filesystem directly. Or in
> other words, let's redefine pseudorefs to match the current definition
> of special refs. As special refs and pseudorefs are now the same per
> definition, we can drop the "special refs" term again. It's not exposed
> to our users and thus they wouldn't ever encounter that term anyway.

Good intentions.

I do not agree with "the ones at the root should not be special" at
all, though.  We need to reject names like 'config' somehow, as long
as there are users who use files backend.

Patrick Steinhardt May 2, 2024, 8:07 a.m. UTC | #4
On Tue, Apr 30, 2024 at 01:12:33PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Let's address this mess and return the pseudoref terminology back to its
> > original intent: a ref that sometimes behave like a ref, but which isn't
> > really a ref because it gets written to the filesystem directly. Or in
> > other words, let's redefine pseudorefs to match the current definition
> > of special refs. As special refs and pseudorefs are now the same per
> > definition, we can drop the "special refs" term again. It's not exposed
> > to our users and thus they wouldn't ever encounter that term anyway.
> 
> Good intentions.
> 
> I do not agree with "the ones at the root should not be special" at
> all, though.  We need to reject names like 'config' somehow, as long
> as there are users who use files backend.

Oh, yes, I totally agree and thought I'd mentioned this in the message.
But it seems like I only mention this in a subsequent message. Let me
add a hint to the commit message that mentions that a subsequent commit
will clearly define "root refs".

In any case, root refs should not be special regarding their behaviour,
but should have a strict naming schema:

    - Only uppercase letters or underscores.

    - Must end with "_HEAD" or be called "HEAD".

    - There is an exhaustive list of legacy root refs that don't conform
      to this naming schema, like "AUTO_MERGE". This list shall not be
      extended in the future.

This explanation is added in patch 3.

Patrick
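
The naming schema listed in the reply above can be sketched as a small Python check. This is only an illustration of the three rules as stated in the email, not Git's actual implementation. The function name `is_root_ref_name` and the constant `LEGACY_ROOT_REFS` are hypothetical, and the legacy set contains only "AUTO_MERGE" because that is the only legacy name the thread mentions; the real exhaustive list is not given here.

```python
import re

# Hypothetical and deliberately incomplete: the email says an exhaustive
# list of legacy root refs exists, but only names "AUTO_MERGE".
LEGACY_ROOT_REFS = frozenset({"AUTO_MERGE"})

# Rule 1: only uppercase letters or underscores.
_ROOT_REF_CHARS = re.compile(r"[A-Z_]+")


def is_root_ref_name(name: str) -> bool:
    """Return True if `name` fits the root ref naming schema from the email."""
    # Rule 3: grandfathered legacy names are accepted as-is.
    if name in LEGACY_ROOT_REFS:
        return True
    if not _ROOT_REF_CHARS.fullmatch(name):
        return False
    # Rule 2: must be called "HEAD" or end with "_HEAD".
    return name == "HEAD" or name.endswith("_HEAD")


if __name__ == "__main__":
    for candidate in ["HEAD", "MERGE_HEAD", "AUTO_MERGE", "config", "refs/heads/main"]:
        print(candidate, is_root_ref_name(candidate))
```

A check along these lines would reject a name such as "config", which is the kind of collision with other files in the repository directory that the earlier reply raises for users of the files backend.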

Patch

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..f5c0f49150 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -496,21 +496,19 @@  exclude;;
 	that start with `refs/bisect/`, but might later include other
 	unusual refs.
 
-[[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+[[def_pseudoref]]pseudoref ref::
+	A ref that has different semantics than normal refs. These refs can be
+	accessed via normal Git commands but may not behave the same as a
+	normal ref in some cases.
++
+The following pseudorefs are known to Git:
+
+ - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
+   may refer to multiple object IDs. Each object ID is annotated with metadata
+   indicating where it was fetched from and its fetch status.
+
+ - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
+   conflicts. It contains all commit IDs which are being merged.
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
@@ -638,20 +636,6 @@  The most notable example is `HEAD`.
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.
 
-[[def_special_ref]]special ref::
-	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
-+
-The following special refs are known to Git:
-
- - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
-   may refer to multiple object IDs. Each object ID is annotated with metadata
-   indicating where it was fetched from and its fetch status.
-
- - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
-   conflicts. It contains all commit IDs which are being merged.
-
 [[def_submodule]]submodule::
 	A <<def_repository,repository>> that holds the history of a
 	separate project inside another repository (the latter of