[v5,0/8] Allow relative worktree linking to be configured by the user

Message ID 20241125-wt_relative_options-v5-0-356d122ff3db@pm.me (mailing list archive)

Caleb White Nov. 26, 2024, 1:51 a.m. UTC
This series introduces the `--[no-]relative-paths` CLI option for
`git worktree {add, move, repair}` commands, as well as the
`worktree.useRelativePaths` configuration setting. When enabled,
these options allow worktrees to be linked using relative paths,
enhancing portability across environments where absolute paths
may differ (e.g., containerized setups, shared network drives).
Git still creates absolute paths by default, but these options allow
users to opt-in to relative paths if desired.
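
To illustrate what a relative link looks like, here is a minimal sketch
that computes the path of one directory relative to another. The helper
names `append()` and `relative_link()` are hypothetical and are not part
of this series or of Git's API. The sketch assumes that both inputs are
absolute, already-normalized paths and that each link is written relative
to the directory that holds the linking file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to out (tracking the length in *len); -1 if it would not fit. */
static int append(char *out, size_t outlen, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (*len + n + 1 > outlen)
		return -1;
	memcpy(out + *len, s, n + 1);
	*len += n;
	return 0;
}

/*
 * Hypothetical helper (not Git's API): write into out the path of `to`
 * relative to the directory `from`. Both must be absolute and already
 * normalized (no ".", "..", doubled or trailing slashes).
 * Returns 0 on success, -1 if out is too small.
 */
static int relative_link(char *out, size_t outlen, const char *from, const char *to)
{
	size_t i = 0, boundary = 0, len = 0;
	const char *p;

	assert(outlen > 0 && from[0] == '/' && to[0] == '/');
	out[0] = '\0';

	/* Walk the shared prefix, remembering the last shared '/'. */
	while (from[i] && from[i] == to[i]) {
		if (from[i] == '/')
			boundary = i;
		i++;
	}
	/* The prefix also ends on a boundary if both sides finish a component. */
	if ((from[i] == '\0' || from[i] == '/') && (to[i] == '\0' || to[i] == '/'))
		boundary = i;

	/* One ".." for every component of `from` below the shared prefix. */
	for (p = from + boundary; *p; p++)
		if (*p == '/' && p[1] && append(out, outlen, &len, len ? "/.." : ".."))
			return -1;

	/* Then descend into what is left of `to`. */
	p = to + boundary;
	while (*p == '/')
		p++;
	if (*p && (append(out, outlen, &len, len ? "/" : "") || append(out, outlen, &len, p)))
		return -1;

	/* Identical directories: the relative path is ".". */
	return len ? 0 : append(out, outlen, &len, ".");
}
```

Because the result depends only on the relative position of the two
directories, it stays valid when their common parent is moved or mounted
elsewhere, which is the portability benefit described above.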

Using the `--relative-paths` option with `worktree {move, repair}`
will convert absolute paths to relative ones, while `--no-relative-paths`
does the reverse. For cases where users want consistency in path handling,
the config option `worktree.useRelativePaths` provides a persistent setting.

A new extension, `relativeWorktrees`, is added to indicate that at least
one worktree in the repository has been linked with relative paths. This
extension is automatically set when a worktree is created or repaired
using the `--relative-paths` option, or when the
`worktree.useRelativePaths` config is set to `true`.

The `relativeWorktrees` extension ensures older Git versions do not
attempt to automatically prune worktrees with relative paths, as they
would not recognize the paths as being valid.
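
As a rough sketch of why an extension can protect against older
versions: as I understand Git's repository format versioning rules, a
client must refuse to operate on a version 1 repository that lists an
extension it does not understand. The function `can_operate()` below is
a hypothetical, simplified model of that rule and is not Git's actual
code; in particular it compares extension names exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of how a Git client decides whether it may operate on
 * a repository. Hypothetical helper, not Git's actual code; extension
 * names are compared exactly here for brevity.
 */
static bool can_operate(int format_version,
			const char *const *repo_exts, size_t n_repo,
			const char *const *known_exts, size_t n_known)
{
	size_t i, j;

	assert(format_version >= 0);
	if (format_version == 0)
		return true;   /* v0: unknown extensions are ignored */
	if (format_version > 1)
		return false;  /* format newer than this client understands */

	/* v1: every extension in the repository must be understood. */
	for (i = 0; i < n_repo; i++) {
		bool found = false;

		for (j = 0; j < n_known && !found; j++)
			found = !strcmp(repo_exts[i], known_exts[j]);
		if (!found)
			return false;
	}
	return true;
}
```

Under this model the protection only applies while the repository stays
at format version 1, which may be why the downgrade to v0 fixed in v2 of
this series mattered.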

Signed-off-by: Caleb White <cdwhite3@pm.me>
---
The base for this patch series is 090d24e9af.

Link to original patch series:
https://lore.kernel.org/git/20241007-wt_relative_paths-v3-0-622cf18c45eb@pm.me

---
Changes in v5:
- Added docs to `--relative-paths` option.
- Added test coverage for `repair_worktrees()` and relative paths.
- Move `strbuf_reset` call in `infer_backlink()`.
- Cleaned up tests.
- Slight stylistic changes.
- Tweaked commit messages.
- Updated base to 090d24e9af.
- Link to v4: https://lore.kernel.org/r/20241031-wt_relative_options-v4-0-07a3dc0f02a3@pm.me
Changes in v4:
- Fixed failing test in ci
- Link to v3: https://lore.kernel.org/r/20241031-wt_relative_options-v3-0-3e44ccdf64e6@pm.me
Changes in v3:
- Split patches into smaller edits.
- Moved tests into the patches with the relevant code changes.
- Removed global `use_relative_paths` and instead pass parameter to functions.
- Changed `infer_backlink` return type from `int` to `ssize_t`.
- Updated `worktree.useRelativePaths` and `--relative-paths` descriptions.
- Reordered patches
- Link to v2: https://lore.kernel.org/r/20241028-wt_relative_options-v2-0-33a5021bd7bb@pm.me
Changes in v2:
- Fixed a bug where repositories with valid extensions would be downgraded
  to v0 during reinitialization, causing future operations to fail.
- Split patch [1/2] into three separate patches.
- Updated cover letter and commit messages.
- Updated documentation wording.
- Link to v1: https://lore.kernel.org/r/20241025-wt_relative_options-v1-0-c3005df76bf9@pm.me

---
Caleb White (8):
      setup: correctly reinitialize repository version
      worktree: add `relativeWorktrees` extension
      worktree: refactor infer_backlink return
      worktree: add `write_worktree_linking_files()` function
      worktree: add relative cli/config options to `add` command
      worktree: add relative cli/config options to `move` command
      worktree: add relative cli/config options to `repair` command
      worktree: refactor `repair_worktree_after_gitdir_move()`

 Documentation/config/extensions.txt |   6 ++
 Documentation/config/worktree.txt   |  10 +++
 Documentation/git-worktree.txt      |   8 ++
 builtin/worktree.c                  |  29 ++++---
 repository.c                        |   1 +
 repository.h                        |   1 +
 setup.c                             |  39 ++++++---
 setup.h                             |   1 +
 t/t0001-init.sh                     |  22 ++++-
 t/t2400-worktree-add.sh             |  45 +++++++++++
 t/t2401-worktree-prune.sh           |   3 +-
 t/t2402-worktree-list.sh            |  22 +++++
 t/t2403-worktree-move.sh            |  25 ++++++
 t/t2406-worktree-repair.sh          |  39 +++++++++
 t/t2408-worktree-relative.sh        |  39 ---------
 t/t5504-fetch-receive-strict.sh     |   6 +-
 worktree.c                          | 157 ++++++++++++++++++++----------------
 worktree.h                          |  22 ++++-
 18 files changed, 333 insertions(+), 142 deletions(-)
---
base-commit: 090d24e9af6e9f59c3f7bee97c42bb1ae3c7f559
change-id: 20241025-wt_relative_options-afa41987bc32

Best regards,

Comments

Junio C Hamano Nov. 26, 2024, 6:18 a.m. UTC | #1
Caleb White <cdwhite3@pm.me> writes:

> Changes in v5:
> - Added docs to `--relative-paths` option.

You already had doc on this, but the default was not described at
all.

 --[no-]relative-paths::
+       Link worktrees using relative paths or absolute paths (default).

> - Added test coverage for `repair_worktrees()` and relative paths.
> - Move `strbuf_reset` call in `infer_backlink()`.

This was more like "revert the change in v4 that moved it
unnecessarily", no?

> - Cleaned up tests.

Yup, there are truly a lot of test changes between v4 and v5.  Many
tests now use existing test helpers, which is good.


> - Slight stylistic changes.

I saw many changes like these (the diff is between v4 and v5)

 static void repair_gitfile(struct worktree *wt,
-                          worktree_repair_fn fn,
-                          void *cb_data,
+                          worktree_repair_fn fn, void *cb_data,
                           int use_relative_paths)

which looked good (the original had fn and cb_data defined on the
same line).

> - Tweaked commit messages.

Updates to the proposed log message for `repair` step [7/8] did not
really "clarify", other than helping readers to see how messy things
are.  It said:

    +    To simplify things, both linking files are written when one of the files
    +    needs to be repaired. In some cases, this fixes the other file before it
    +    is checked, in other cases this results in a correct file being written
    +    with the same contents.

which may correctly describe what the code happens to do, but does
not quite help build confidence that what it does is correct.

Suppose that the directory X has a repository, and the repository
thinks that the directory W is its worktree.  But the worktree at
the directory W thinks that its repository is not X but Y, and there
indeed is a repository at the directory Y.  That repository thinks W
belongs to it.

If we examine X first, would we end up updating W to point at X
(because X thinks W is its worktree)?

Or do we make W point at Y (because Y thinks W is its, and W
thinks it is Y's)?

Either way, I think the comment is trying to say that, if we decide
to make X and W belong to each other, we'd overwrite links from X to
W and also W to X, even though the link from X was already pointing
at W and the minimum fix we needed to make was to update the link
from W to point at X.  Overwriting a link from X to W with a new
link from X to W is a no-op, so it does not seem to help greatly,
since `repair` is not at all performance critical.  The correctness
is a lot more important.


> - Updated base to 090d24e9af.

This made it harder than necessary to compare the two iterations, by
the way.


Thanks.
Caleb White Nov. 26, 2024, 5:02 p.m. UTC | #2
On Tue Nov 26, 2024 at 12:18 AM CST, Junio C Hamano wrote:
> Caleb White <cdwhite3@pm.me> writes:
>> Changes in v5:
>> - Added docs to `--relative-paths` option.
>
> You already had doc on this, but the default was not described at
> all.
>
>  --[no-]relative-paths::
> +       Link worktrees using relative paths or absolute paths (default).

I added a bit more explanation instead of just directing the user to the
config variable. (I originally had docs, but it was requested that
I remove the duplication and just point to the config; however, I think
the change describes it a bit better and also gives the default.)

>> - Added test coverage for `repair_worktrees()` and relative paths.
>> - Move `strbuf_reset` call in `infer_backlink()`.
>
> This was more like "revert the change in v4 that moved it
> unnecessarily", no?

Yes, that is correct.

>> - Cleaned up tests.
>
> Yup, there are truly a lot of test changes between v4 and v5.  Many
> tests now use existing test helpers, which is good.

This is the majority of the reroll. It seems like macOS doesn't like the
`test_config` helper inside of a subshell, so I had to stick with `git
config` in those cases.

>> - Slight stylistic changes.
>
> I saw many changes like these (the diff is between v4 and v5)
>
>  static void repair_gitfile(struct worktree *wt,
> -                          worktree_repair_fn fn,
> -                          void *cb_data,
> +                          worktree_repair_fn fn, void *cb_data,
>                            int use_relative_paths)
>
> which looked good (the original had fn and cb_data defined on the
> same line).

Yes, this was brought up in the previous review and I decided to make
the change.

>> - Tweaked commit messages.
>
> Updates to the proposed log message for `repair` step [7/8] did not
> really "clarify", other than helping readers to see how messy things
> are.  It said:
>
>     +    To simplify things, both linking files are written when one of the files
>     +    needs to be repaired. In some cases, this fixes the other file before it
>     +    is checked, in other cases this results in a correct file being written
>     +    with the same contents.
>
> which may correctly describe what the code happens to do, but does
> not quite help build confidence that what it does is correct.
>
> Suppose that the directory X has a repository, and the repository
> thinks that the directory W is its worktree.  But the worktree at
> the directory W thinks that its repository is not X but Y, and there
> indeed is a repository at the directory Y.  That repository thinks W
> belongs to it.

That's a bit of a confusing scenario, but I think this is what you're
trying to describe:

    Repository X ----> Worktree W <---> Repository Y (Case 0)

which is not a normal case (but I'll get to that later).
Most of the time, a repair would be performed with one of the following
cases:

    Repository X <---> Worktree W (Case 1)
    Repository X ----> Worktree W (Case 2)
    Repository X <---- Worktree W (Case 3)
    Repository X       Worktree W (Case 4)

that is, the repository and worktree have valid links in both directions,
in one direction only, or in neither direction.
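
The four cases can be expressed as a small classification over the two
link directions. This is only an illustrative sketch: the enum
`link_case` and the function `classify_links()` are hypothetical names
and do not exist in Git.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four link states listed above. */
enum link_case {
	LINK_BOTH = 1,      /* Case 1: Repository X <---> Worktree W */
	LINK_REPO_ONLY = 2, /* Case 2: Repository X ----> Worktree W */
	LINK_WT_ONLY = 3,   /* Case 3: Repository X <---- Worktree W */
	LINK_NONE = 4       /* Case 4: no valid link either way */
};

/*
 * repo_points_to_wt: the repository's `gitdir` file names the worktree.
 * wt_points_to_repo: the worktree's `.git` file names the repository.
 */
static enum link_case classify_links(bool repo_points_to_wt, bool wt_points_to_repo)
{
	if (repo_points_to_wt && wt_points_to_repo)
		return LINK_BOTH;
	if (repo_points_to_wt)
		return LINK_REPO_ONLY;
	if (wt_points_to_repo)
		return LINK_WT_ONLY;
	return LINK_NONE;
}
```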

Before I go on, I think it would be helpful to revisit how the repair
operation works. There are two loops in the repair operation:
1. the `repair()` function iterates over (via `repair_worktree_at_path()`)
   the given worktrees/paths (or `.` if no paths are given) to potentially
   repair the `<repo>/.git/worktrees/<worktree_id>/gitdir` files
2. the `repair_worktrees()` function iterates over all the worktrees
   defined at `<repo>/.git/worktrees/*` to potentially repair the
   `.../<worktree_id>/.git` files

In Loop 1, a repair is performed if:
- there's an absolute/relative path mismatch
- the worktree `.git` file points to the repository, but the repository
  `gitdir` file is unreadable or does not point back to the worktree
- the worktree `.git` file does NOT point to the repository, but an
  inferred backlink can be established (the worktree id in the `.git`
  file matches a worktree id in the repository's `worktrees` directory),
  and that inferred repository does not point to the worktree

In Loop 2, a repair is performed if:
- there's an absolute/relative path mismatch
- the worktree pointed to by the repository `gitdir` file does not point
  back to the repository or the file is corrupted
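
The conditions for the two loops can be summarized as two predicates.
This is a sketch of the rules as listed above, not the actual code in
`worktree.c`; the struct `link_state` and both function names are
hypothetical, and each field is assumed to have been computed already.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what a repair pass has observed. */
struct link_state {
	bool path_style_mismatch; /* absolute link where relative is wanted, or vice versa */
	bool wt_points_to_repo;   /* worktree `.git` file names this repository */
	bool repo_points_to_wt;   /* repository `gitdir` file is readable and names the worktree */
	bool backlink_inferred;   /* worktree id in `.git` matches an id under `worktrees/` */
	bool wt_gitfile_corrupt;  /* worktree `.git` file cannot be parsed */
};

/* Loop 1: may rewrite the repository's `gitdir` file. */
static bool loop1_needs_repair(const struct link_state *s)
{
	assert(s);
	if (s->path_style_mismatch)
		return true;
	if (s->wt_points_to_repo)
		return !s->repo_points_to_wt;
	return s->backlink_inferred && !s->repo_points_to_wt;
}

/* Loop 2: may rewrite the worktree's `.git` file. */
static bool loop2_needs_repair(const struct link_state *s)
{
	assert(s);
	return s->path_style_mismatch ||
	       s->wt_gitfile_corrupt ||
	       !s->wt_points_to_repo;
}
```

With both links valid and no path-style mismatch, neither predicate
fires, so a healthy worktree is left untouched.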

Now back to Cases 1--4:
- In Case 1, the repair would not update any links (already valid).
- Case 2 is most likely when using absolute paths and the repository is
  moved, but the worktree is not. The worktree `.git` will be updated
  during Loop 2, however, now the repository `gitdir` file will also be
  written with the same contents (a no-op) to keep the code simple.
- Case 3 is most likely when using absolute paths and the worktree is
  moved, but the repository is not. The repository `gitdir` will be
  updated during Loop 1, however, now the worktree `.git` file will also
  be written with the same contents (a no-op) to keep the code simple.
- Case 4 can occur when using absolute paths and both the repository and
  worktree are moved, but it can also occur when using relative paths and
  either the repository or worktree is moved. Both linking files need to
  be updated. In the past, the repository `gitdir` file would be updated
  during Loop 1 (from the inferred backlink), and the worktree `.git`
  would not be updated until Loop 2. However, now both linking files are
  updated during Loop 1 and the repair is complete by the time Loop 2
  is reached.

> If we examine X first, would we end up updating W to point at X
> (because X thinks W is its worktree)?
>
> Or do we make W point at Y (because Y thinks W is its, and W
> thinks it is Y's)?

A repair is always performed in the context of a single repository;
therefore, if operating on repository X and the worktree W is found to
be a valid worktree for X, then yes, the repair would update the link
from W to X so that Case 0 would now look like:

    Repository X <---> Worktree W <---- Repository Y

but again, this is a very weird case---the most likely scenario that I
can think of is that a user copied a repository (with or without the
worktree). The `es/worktree-repair-copied` topic added support for
repairing a worktree from such a copy scenario. However, I did note[1,2]
that the topic added the ability for a repository to "take over" a
worktree from another repository if the worktree_id matched a worktree
inside the current repository. This can happen if two repositories use
the same worktree name (I usually name my worktrees the same name as the
branch to keep things simple so this can happen if two repositories
create a worktree for `master` for instance).

I recommended that worktrees be created with a unique hash/identifier so
that the worktree_id is unique across all repositories even if they have
the same name. I was planning on creating a future topic to address this,
for example creating a worktree `develop` would look like:

    foo/
    ├── .git/worktrees/develop-6b3d7b/
    └── develop/

The actual worktree directory name would still be `develop`, but the
worktree_id would be unique and prevent the "take over" scenario.
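
A minimal sketch of how such an id might be formed, assuming a suffix
of six hex digits as in the example above. The function
`make_worktree_id()` and its `nonce` parameter are hypothetical; how the
unique value would actually be derived (random bytes, a hash of the
path, or something else) has not been decided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<name>-<6 hex digits>" from the worktree
 * name and the low 24 bits of a caller-supplied nonce.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int make_worktree_id(char *out, size_t outlen, const char *name, uint32_t nonce)
{
	int n;

	assert(out && name);
	/* Need room for the name, '-', six hex digits and the NUL. */
	if (strlen(name) + 8 > outlen)
		return -1;
	n = snprintf(out, outlen, "%s-%06x", name, (unsigned)(nonce & 0xffffffu));
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Only the directory under `.git/worktrees/` would carry the suffix; the
checked-out directory would keep the plain name.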

> Either way, I think the comment is trying to say that, if we decide
> to make X and W belong to each other, we'd overwrite links from X to
> W and also W to X, even though the link from X was already pointing
> at W and the minimum fix we needed to make was to update the link
> from W to point at X.  Overwriting a link from X to W with a new
> link from X to W is a no-op, so it does not seem to help greatly,
> since `repair` is not at all performance critical.  The correctness
> is a lot more important.

Yes, you understand this correctly. The repair operation is not
performance critical, so I decided to keep the code simple and just
always update both linking files. The same `write_worktree_linking_files()`
function is used for all operations (add, move, repair), some of which
require both files to be updated, and others only one.

>> - Updated base to 090d24e9af.
>
> This made it harder than necessary to compare the two iterations, by
> the way.

My apologies for that. I wasn't sure what the procedure was when
a dependent topic was merged to master. I figured it would be best to
rebase onto the latest master.

Best,

Caleb

[1]: https://lore.kernel.org/git/20241008153035.71178-1-cdwhite3@pm.me/
[2]: https://lore.kernel.org/git/r4zmcET41Skr_FMop47AKd7cms9E8bKPSvHuAUpnYavzKEY6JybJta0_7GfuYB0q-gD-XNcvh5VDTfiT3qthGKjqhS1sbT4M2lUABynOz2Q=@pm.me/