mbox series

[v3,0/5] Sparse Index: Integrate with 'git add'

Message ID pull.999.v3.git.1627570327.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Sparse Index: Integrate with 'git add' | expand

Message

Philippe Blain via GitGitGadget July 29, 2021, 2:52 p.m. UTC
This patch series re-submits the 'git add' integration with sparse-index.
The performance gains are the same as before.

It is based on ds/commit-and-checkout-with-sparse-index.

This series was delayed from its initial submission for a couple reasons.

The first was because it was colliding with some changes in
mt/add-rm-in-sparse-checkout, so now we are far enough along that that
branch is in our history and we can work forwards.

The other concern was about how 'git add ' should respond when a path
outside of the sparse-checkout cone exists. One recommendation (that I am
failing to find a link to the message, sorry) was to disallow adding files
that would become index entries with SKIP_WORKTREE on. However, as I worked
towards that goal I found that change would cause problems for a realistic
scenario: merge conflicts outside of the sparse-checkout cone.

Update: Elijah points out that the SKIP_WORKTREE bit is removed from
conflict files, which allows adding the conflicted files without warning.
(However, we also need to be careful about untracked files, as documented in
the test added here.)

The first patch of this series adds tests that create merge conflicts
outside of the sparse cone and then presents different ways a user could
resolve the situation. We want all of them to be feasible, and this
includes:

 1. Reverting the file to a known version in history.
 2. Adding the file with its contents on disk.
 3. Moving the file to a new location in the sparse directory.

The one place I did continue to update is 'git add --refresh ' to match the
behavior added by mt/add-rm-in-sparse-checkout which outputs an error
message. This happens even when the file exists in the working directory,
but that seems appropriate enough.


Updates in V3
=============

 * Added disclaimer to the merge-conflict test that this is documenting
   current behavior, not endorsing it.

 * Added Elijah's reviewed-by. Thanks for the review!

Thanks, -Stolee

Derrick Stolee (5):
  t1092: test merge conflicts outside cone
  add: allow operating on a sparse-only index
  pathspec: stop calling ensure_full_index
  add: ignore outside the sparse-checkout in refresh()
  add: remove ensure_full_index() with --renormalize

 builtin/add.c                            | 15 ++++--
 pathspec.c                               |  2 -
 t/t1092-sparse-checkout-compatibility.sh | 67 ++++++++++++++++++++----
 3 files changed, 70 insertions(+), 14 deletions(-)


base-commit: 71e301501c88399711a1bf8515d1747e92cfbb9b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-999%2Fderrickstolee%2Fsparse-index%2Fadd-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-999/derrickstolee/sparse-index/add-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/999

Range-diff vs v2:

 1:  8f2fd9370fe ! 1:  5e96df4df58 t1092: test merge conflicts outside cone
     @@ Metadata
       ## Commit message ##
          t1092: test merge conflicts outside cone
      
     +    Conflicts can occur outside of the sparse-checkout definition, and in
     +    that case users might try to resolve the conflicts in several ways.
     +    Document a few of these ways in a test. Make it clear that this behavior
     +    is not necessarily the optimal flow, since users can become confused
     +    when Git deletes these files from the worktree in later commands.
     +
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'merge' '
       	test_all_match git rev-parse HEAD^{tree}
       '
       
     ++# NEEDSWORK: This test is documenting current behavior, but that
     ++# behavior can be confusing to users so there is desire to change it.
     ++# Right now, users might be using this flow to work through conflicts,
     ++# so any solution should present advice to users who try this sequence
     ++# of commands to follow whatever new method we create.
      +test_expect_success 'merge with conflict outside cone' '
      +	init_repos &&
      +
 2:  6e43f118fa0 ! 2:  defab1b86d3 add: allow operating on a sparse-only index
     @@ Commit message
          the use of a sparse index. We modify a test in t1092 to demonstrate
          these changes which will be remedied in future changes.
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/add.c ##
 3:  2ae91e0af29 ! 3:  9fc4313c889 pathspec: stop calling ensure_full_index
     @@ Commit message
          commits. Comparing to the full index case, we see the performance go
          from 0.33s to 0.05s, an 85% improvement.
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## pathspec.c ##
 4:  a79728d4c64 ! 4:  0ec03ab021d add: ignore outside the sparse-checkout in refresh()
     @@ Commit message
          tracked, untracked, or ignored. We simply avoid updating the stat()
          information because there isn't even an entry that matches the path!
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/add.c ##
 5:  1543550a4e8 ! 5:  adf5b15ac3d add: remove ensure_full_index() with --renormalize
     @@ Commit message
          SKIP_WORKTREE bit, so it will continue to do so with a sparse index
          because the sparse directory entries also have this bit set.
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/add.c ##

Comments

Elijah Newren July 29, 2021, 2:58 p.m. UTC | #1
On Thu, Jul 29, 2021 at 8:52 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This patch series re-submits the 'git add' integration with sparse-index.
> The performance gains are the same as before.
>
> It is based on ds/commit-and-checkout-with-sparse-index.
>
> This series was delayed from its initial submission for a couple reasons.
>
> The first was because it was colliding with some changes in
> mt/add-rm-in-sparse-checkout, so now we are far enough along that that
> branch is in our history and we can work forwards.
>
> The other concern was about how 'git add ' should respond when a path
> outside of the sparse-checkout cone exists. One recommendation (that I am
> failing to find a link to the message, sorry) was to disallow adding files
> that would become index entries with SKIP_WORKTREE on. However, as I worked
> towards that goal I found that change would cause problems for a realistic
> scenario: merge conflicts outside of the sparse-checkout cone.
>
> Update: Elijah points out that the SKIP_WORKTREE bit is removed from
> conflict files, which allows adding the conflicted files without warning.
> (However, we also need to be careful about untracked files, as documented in
> the test added here.)
>
> The first patch of this series adds tests that create merge conflicts
> outside of the sparse cone and then presents different ways a user could
> resolve the situation. We want all of them to be feasible, and this
> includes:
>
>  1. Reverting the file to a known version in history.
>  2. Adding the file with its contents on disk.
>  3. Moving the file to a new location in the sparse directory.
>
> The one place I did continue to update is 'git add --refresh ' to match the
> behavior added by mt/add-rm-in-sparse-checkout which outputs an error
> message. This happens even when the file exists in the working directory,
> but that seems appropriate enough.
>
>
> Updates in V3
> =============
>
>  * Added disclaimer to the merge-conflict test that this is documenting
>    current behavior, not endorsing it.
>
>  * Added Elijah's reviewed-by. Thanks for the review!

Yep, this one looks ready to merge down to me.  Thanks!

> Thanks, -Stolee
>
> Derrick Stolee (5):
>   t1092: test merge conflicts outside cone
>   add: allow operating on a sparse-only index
>   pathspec: stop calling ensure_full_index
>   add: ignore outside the sparse-checkout in refresh()
>   add: remove ensure_full_index() with --renormalize
>
>  builtin/add.c                            | 15 ++++--
>  pathspec.c                               |  2 -
>  t/t1092-sparse-checkout-compatibility.sh | 67 ++++++++++++++++++++----
>  3 files changed, 70 insertions(+), 14 deletions(-)
>
>
> base-commit: 71e301501c88399711a1bf8515d1747e92cfbb9b
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-999%2Fderrickstolee%2Fsparse-index%2Fadd-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-999/derrickstolee/sparse-index/add-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/999
>
> Range-diff vs v2:
>
>  1:  8f2fd9370fe ! 1:  5e96df4df58 t1092: test merge conflicts outside cone
>      @@ Metadata
>        ## Commit message ##
>           t1092: test merge conflicts outside cone
>
>      +    Conflicts can occur outside of the sparse-checkout definition, and in
>      +    that case users might try to resolve the conflicts in several ways.
>      +    Document a few of these ways in a test. Make it clear that this behavior
>      +    is not necessarily the optimal flow, since users can become confused
>      +    when Git deletes these files from the worktree in later commands.
>      +
>      +    Reviewed-by: Elijah Newren <newren@gmail.com>
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
>        ## t/t1092-sparse-checkout-compatibility.sh ##
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'merge' '
>         test_all_match git rev-parse HEAD^{tree}
>        '
>
>      ++# NEEDSWORK: This test is documenting current behavior, but that
>      ++# behavior can be confusing to users so there is desire to change it.
>      ++# Right now, users might be using this flow to work through conflicts,
>      ++# so any solution should present advice to users who try this sequence
>      ++# of commands to follow whatever new method we create.
>       +test_expect_success 'merge with conflict outside cone' '
>       + init_repos &&
>       +
>  2:  6e43f118fa0 ! 2:  defab1b86d3 add: allow operating on a sparse-only index
>      @@ Commit message
>           the use of a sparse index. We modify a test in t1092 to demonstrate
>           these changes which will be remedied in future changes.
>
>      +    Reviewed-by: Elijah Newren <newren@gmail.com>
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
>        ## builtin/add.c ##
>  3:  2ae91e0af29 ! 3:  9fc4313c889 pathspec: stop calling ensure_full_index
>      @@ Commit message
>           commits. Comparing to the full index case, we see the performance go
>           from 0.33s to 0.05s, an 85% improvement.
>
>      +    Reviewed-by: Elijah Newren <newren@gmail.com>
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
>        ## pathspec.c ##
>  4:  a79728d4c64 ! 4:  0ec03ab021d add: ignore outside the sparse-checkout in refresh()
>      @@ Commit message
>           tracked, untracked, or ignored. We simply avoid updating the stat()
>           information because there isn't even an entry that matches the path!
>
>      +    Reviewed-by: Elijah Newren <newren@gmail.com>
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
>        ## builtin/add.c ##
>  5:  1543550a4e8 ! 5:  adf5b15ac3d add: remove ensure_full_index() with --renormalize
>      @@ Commit message
>           SKIP_WORKTREE bit, so it will continue to do so with a sparse index
>           because the sparse directory entries also have this bit set.
>
>      +    Reviewed-by: Elijah Newren <newren@gmail.com>
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
>        ## builtin/add.c ##
>
> --
> gitgitgadget
Junio C Hamano July 29, 2021, 11 p.m. UTC | #2
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Update: Elijah points out that the SKIP_WORKTREE bit is removed from
> conflict files, which allows adding the conflicted files without warning.
> (However, we also need to be careful about untracked files, as documented in
> the test added here.)
>
> The first patch of this series adds tests that create merge conflicts
> outside of the sparse cone and then presents different ways a user could
> resolve the situation. We want all of them to be feasible, and this
> includes:
>
>  1. Reverting the file to a known version in history.
>  2. Adding the file with its contents on disk.
>  3. Moving the file to a new location in the sparse directory.
>
> The one place I did continue to update is 'git add --refresh ' to match the
> behavior added by mt/add-rm-in-sparse-checkout which outputs an error
> message. This happens even when the file exists in the working directory,
> but that seems appropriate enough.
>
>
> Updates in V3
> =============
>
>  * Added disclaimer to the merge-conflict test that this is documenting
>    current behavior, not endorsing it.
>
>  * Added Elijah's reviewed-by. Thanks for the review!
>
> Thanks, -Stolee

Thanks, both.  Let's queue this and start merging it down.