diff mbox series

[06/13] add: skip paths that are outside sparse-checkout cone

Message ID eeba97ad492307302637faf33f6bf6ae8965faa3.1629842085.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior | expand

Commit Message

Derrick Stolee Aug. 24, 2021, 9:54 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

When 'git add' adds a tracked file that is outside of the
sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file
exists outside of the sparse-checkout cone. This is usually correct,
except in the case of a merge conflict outside of the cone.

Modify add_pathspec_matched_against_index() to be more careful about
pathes by checking the sparse-checkout patterns in addition to the
SKIP_WORKTREE bit. This causes 'git add' to no longer allow files
outside of the cone that removed the SKIP_WORKTREE bit due to a merge
conflict.

With only this change, users will only be able to add the file after
adding the file to the sparse-checkout cone. A later change will allow
users to force adding even though the file is outside of the
sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 pathspec.c                               | 3 ++-
 t/t1092-sparse-checkout-compatibility.sh | 8 ++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

Comments

Matheus Tavares Aug. 27, 2021, 9:13 p.m. UTC | #1
On Tue, Aug 24, 2021 at 6:54 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> Subject: [PATCH 06/13] add: skip paths that are outside sparse-checkout cone

Perhaps this could be "skip _tracked_ paths that ..." (to help
differentiate the end goal of this patch from the previous one).

> diff --git a/pathspec.c b/pathspec.c
> index 44306fdaca2..0e6e60fdc5a 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -39,7 +39,8 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
>                 return;
>         for (i = 0; i < istate->cache_nr; i++) {
>                 const struct cache_entry *ce = istate->cache[i];
> -               if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
> +               if (sw_action == PS_IGNORE_SKIP_WORKTREE &&
> +                   (ce_skip_worktree(ce) || !path_in_sparse_checkout(ce->name, istate)))

Hmm, even though we skip the sparse paths here, cmd_add() will call
add_files_to_cache() at the end and still update these paths in the
index. I think there are two ways to fix this. We could either change
run_diff_files() to skip these paths (but I don't know how other callers
of this functions want to handle this, so maybe this needs to hide
behind an option flag):

diff --git a/diff-lib.c b/diff-lib.c
index f9eadc4fc1..4245d7ead5 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -198,7 +198,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 				continue;
 		}
 
-		if (ce_uptodate(ce) || ce_skip_worktree(ce))
+		if (ce_uptodate(ce) || ce_skip_worktree(ce) ||
+		    !path_in_sparse_checkout(ce->name, istate))
 			continue;
 
 		/*

Or we could change the callback in add itself:

diff --git a/builtin/add.c b/builtin/add.c
index f675bdeae4..3d7762aac2 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -94,6 +94,10 @@ static void update_callback(struct diff_queue_struct *q,
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		const char *path = p->one->path;
+
+		if (!path_in_sparse_checkout(path, &the_index))
+			continue;
+
 		switch (fix_unmerged_status(p, data)) {
 		default:
 			die(_("unexpected diff status %c"), p->status);

I believe we also need to update a few other places to use the
`(ce_skip_worktree(ce) || !path_in_sparse_checkout())` logic in order to
avoid updating tracked sparse paths: chmod_pathspec() for add's --chmod
option, renormalize_tracked_files() for --renormalize, and
read-cache.c:refresh_index() for --refresh.

>                         continue;
>                 ce_path_match(istate, ce, pathspec, seen);
>         }

Hmm, don't we also want to update
find_pathspecs_matching_skip_worktree() in this patch to use
path_in_sparse_checkout()? I see you did that in patch 8, but I think
this should be together with this current patch as, without it, we stop
adding tracked sparse paths but we print no error/advice message about
it.
Derrick Stolee Sept. 8, 2021, 7:46 p.m. UTC | #2
On 8/27/2021 5:13 PM, Matheus Tavares wrote:
> On Tue, Aug 24, 2021 at 6:54 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>>
>> Subject: [PATCH 06/13] add: skip paths that are outside sparse-checkout cone
> 
> Perhaps this could be "skip _tracked_ paths that ..." (to help
> differentiate the end goal of this patch from the previous one).
> 
>> diff --git a/pathspec.c b/pathspec.c
>> index 44306fdaca2..0e6e60fdc5a 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>> @@ -39,7 +39,8 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
>>                 return;
>>         for (i = 0; i < istate->cache_nr; i++) {
>>                 const struct cache_entry *ce = istate->cache[i];
>> -               if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
>> +               if (sw_action == PS_IGNORE_SKIP_WORKTREE &&
>> +                   (ce_skip_worktree(ce) || !path_in_sparse_checkout(ce->name, istate)))
> 
> Hmm, even though we skip the sparse paths here, cmd_add() will call
> add_files_to_cache() at the end and still update these paths in the
> index. I think there are two ways to fix this. We could either change
> run_diff_files() to skip these paths (but I don't know how other callers
> of this functions want to handle this, so maybe this needs to hide
> behind an option flag):

You are absolutely right to point this out. I had missed this interaction.
But, this is also already broken. The patch below adds a check to show that
'git add' does not add the sparse_entry, but it does (even when applied
before any patch in this series). That is: all the modified tests fail
after this change. I'll work to fix this issue before the next version of
this series.

Thanks,
-Stolee

--- >8 ---

From 21dab466d221e8632d98553f5f1fa900a2d47c7f Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Wed, 8 Sep 2021 15:40:32 -0400
Subject: [PATCH] t3705: test that 'sparse_entry' is unstaged

The tests in t3705-add-sparse-checkout.sh check to see how 'git add'
behaves with paths outside the sparse-checkout definition. These
currently check to see if a given warning is present but not that the
index is not updated with the sparse entries. Add a new
'test_sparse_entry_unstaged' helper to be sure 'git add' is behaving
correctly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t3705-add-sparse-checkout.sh | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/t/t3705-add-sparse-checkout.sh b/t/t3705-add-sparse-checkout.sh
index 2b1fd0d0eef..d31a0d4f550 100755
--- a/t/t3705-add-sparse-checkout.sh
+++ b/t/t3705-add-sparse-checkout.sh
@@ -36,6 +36,11 @@ setup_gitignore () {
 	EOF
 }
 
+test_sparse_entry_unstaged () {
+	git status --porcelain >actual &&
+	! grep "A  sparse_entry" actual
+}
+
 test_expect_success 'setup' "
 	cat >sparse_error_header <<-EOF &&
 	The following pathspecs didn't match any eligible path, but they do match index
@@ -55,6 +60,7 @@ test_expect_success 'git add does not remove sparse entries' '
 	setup_sparse_entry &&
 	rm sparse_entry &&
 	test_must_fail git add sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp error_and_hint stderr &&
 	test_sparse_entry_unchanged
 '
@@ -73,6 +79,7 @@ test_expect_success 'git add . does not remove sparse entries' '
 	rm sparse_entry &&
 	setup_gitignore &&
 	test_must_fail git add . 2>stderr &&
+	test_sparse_entry_unstaged &&
 
 	cat sparse_error_header >expect &&
 	echo . >>expect &&
@@ -88,6 +95,7 @@ do
 		setup_sparse_entry &&
 		echo modified >sparse_entry &&
 		test_must_fail git add $opt sparse_entry 2>stderr &&
+		test_sparse_entry_unstaged &&
 		test_cmp error_and_hint stderr &&
 		test_sparse_entry_unchanged
 	'
@@ -98,6 +106,7 @@ test_expect_success 'git add --refresh does not update sparse entries' '
 	git ls-files --debug sparse_entry | grep mtime >before &&
 	test-tool chmtime -60 sparse_entry &&
 	test_must_fail git add --refresh sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp error_and_hint stderr &&
 	git ls-files --debug sparse_entry | grep mtime >after &&
 	test_cmp before after
@@ -106,6 +115,7 @@ test_expect_success 'git add --refresh does not update sparse entries' '
 test_expect_success 'git add --chmod does not update sparse entries' '
 	setup_sparse_entry &&
 	test_must_fail git add --chmod=+x sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp error_and_hint stderr &&
 	test_sparse_entry_unchanged &&
 	! test -x sparse_entry
@@ -116,6 +126,7 @@ test_expect_success 'git add --renormalize does not update sparse entries' '
 	setup_sparse_entry "LINEONE\r\nLINETWO\r\n" &&
 	echo "sparse_entry text=auto" >.gitattributes &&
 	test_must_fail git add --renormalize sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp error_and_hint stderr &&
 	test_sparse_entry_unchanged
 '
@@ -124,6 +135,7 @@ test_expect_success 'git add --dry-run --ignore-missing warn on sparse path' '
 	setup_sparse_entry &&
 	rm sparse_entry &&
 	test_must_fail git add --dry-run --ignore-missing sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp error_and_hint stderr &&
 	test_sparse_entry_unchanged
 '
@@ -148,6 +160,7 @@ test_expect_success 'do not warn when pathspec matches dense entries' '
 test_expect_success 'add obeys advice.updateSparsePath' '
 	setup_sparse_entry &&
 	test_must_fail git -c advice.updateSparsePath=false add sparse_entry 2>stderr &&
+	test_sparse_entry_unstaged &&
 	test_cmp sparse_entry_error stderr
 
 '
Derrick Stolee Sept. 8, 2021, 8:02 p.m. UTC | #3
On 9/8/2021 3:46 PM, Derrick Stolee wrote:
> On 8/27/2021 5:13 PM, Matheus Tavares wrote:
>> On Tue, Aug 24, 2021 at 6:54 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>>>
>>> Subject: [PATCH 06/13] add: skip paths that are outside sparse-checkout cone
>>
>> Perhaps this could be "skip _tracked_ paths that ..." (to help
>> differentiate the end goal of this patch from the previous one).
>>
>>> diff --git a/pathspec.c b/pathspec.c
>>> index 44306fdaca2..0e6e60fdc5a 100644
>>> --- a/pathspec.c
>>> +++ b/pathspec.c
>>> @@ -39,7 +39,8 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
>>>                 return;
>>>         for (i = 0; i < istate->cache_nr; i++) {
>>>                 const struct cache_entry *ce = istate->cache[i];
>>> -               if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
>>> +               if (sw_action == PS_IGNORE_SKIP_WORKTREE &&
>>> +                   (ce_skip_worktree(ce) || !path_in_sparse_checkout(ce->name, istate)))
>>
>> Hmm, even though we skip the sparse paths here, cmd_add() will call
>> add_files_to_cache() at the end and still update these paths in the
>> index. I think there are two ways to fix this. We could either change
>> run_diff_files() to skip these paths (but I don't know how other callers
>> of this functions want to handle this, so maybe this needs to hide
>> behind an option flag):
> 
> You are absolutely right to point this out. I had missed this interaction.
> But, this is also already broken. The patch below adds a check to show that
> 'git add' does not add the sparse_entry, but it does (even when applied
> before any patch in this series). That is: all the modified tests fail
> after this change. I'll work to fix this issue before the next version of
> this series.

Of course, the reason for the failures is because the 'sparse_entry' is
staged as part of setup_sparse_entry. Not a bug.

This makes things more difficult to test, so I'll look around for an
alternative way to test that 'git add' is behaving correctly.

Thanks,
-Stolee
Derrick Stolee Sept. 8, 2021, 9:06 p.m. UTC | #4
On 8/27/2021 5:13 PM, Matheus Tavares wrote:
> On Tue, Aug 24, 2021 at 6:54 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>>
>> Subject: [PATCH 06/13] add: skip paths that are outside sparse-checkout cone
> 
> Perhaps this could be "skip _tracked_ paths that ..." (to help
> differentiate the end goal of this patch from the previous one).
> 
>> diff --git a/pathspec.c b/pathspec.c
>> index 44306fdaca2..0e6e60fdc5a 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>> @@ -39,7 +39,8 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
>>                 return;
>>         for (i = 0; i < istate->cache_nr; i++) {
>>                 const struct cache_entry *ce = istate->cache[i];
>> -               if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
>> +               if (sw_action == PS_IGNORE_SKIP_WORKTREE &&
>> +                   (ce_skip_worktree(ce) || !path_in_sparse_checkout(ce->name, istate)))
> 
> Hmm, even though we skip the sparse paths here, cmd_add() will call
> add_files_to_cache() at the end and still update these paths in the
> index. I think there are two ways to fix this. We could either change
> run_diff_files() to skip these paths (but I don't know how other callers
> of this functions want to handle this, so maybe this needs to hide
> behind an option flag):

Ok, this sent me off on a tangent (see other replies) trying to show
that 'git add <sparse-path>' is still modifying index entries. I finally
added enough checks that this fails in my merge/cherry-pick/rebase tests
for conflicts outside of the sparse-checkout cone. Here is a test that I
can add to t3705 to get a failure:

test_expect_success 'git add fails outside of sparse-checkout definition' '
	test_when_finished git sparse-checkout disable &&
	test_commit a &&
	git sparse-checkout init &&
	git sparse-checkout set a &&

	git update-index --no-skip-worktree sparse_entry &&
	test_must_fail git add sparse_entry &&
	test_sparse_entry_unstaged
'
 
> diff --git a/diff-lib.c b/diff-lib.c
> index f9eadc4fc1..4245d7ead5 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -198,7 +198,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  				continue;
>  		}
>  
> -		if (ce_uptodate(ce) || ce_skip_worktree(ce))
> +		if (ce_uptodate(ce) || ce_skip_worktree(ce) ||
> +		    !path_in_sparse_checkout(ce->name, istate))
>  			continue;
>  
>  		/*
> 
> Or we could change the callback in add itself:
> 
> diff --git a/builtin/add.c b/builtin/add.c
> index f675bdeae4..3d7762aac2 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -94,6 +94,10 @@ static void update_callback(struct diff_queue_struct *q,
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filepair *p = q->queue[i];
>  		const char *path = p->one->path;
> +
> +		if (!path_in_sparse_checkout(path, &the_index))
> +			continue;
> +
>  		switch (fix_unmerged_status(p, data)) {
>  		default:
>  			die(_("unexpected diff status %c"), p->status);

If I use this second callback, then I have access to 'include_sparse'
in the later change that adds the --sparse option.

> I believe we also need to update a few other places to use the
> `(ce_skip_worktree(ce) || !path_in_sparse_checkout())` logic in order to
> avoid updating tracked sparse paths: chmod_pathspec() for add's --chmod
> option, renormalize_tracked_files() for --renormalize, and
> read-cache.c:refresh_index() for --refresh.

OK, I'll make sure to include and test these cases.
 
>>                         continue;
>>                 ce_path_match(istate, ce, pathspec, seen);
>>         }
> 
> Hmm, don't we also want to update
> find_pathspecs_matching_skip_worktree() in this patch to use
> path_in_sparse_checkout()? I see you did that in patch 8, but I think
> this should be together with this current patch as, without it, we stop
> adding tracked sparse paths but we print no error/advice message about
> it.

I'll merge the patches to avoid confusion.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/pathspec.c b/pathspec.c
index 44306fdaca2..0e6e60fdc5a 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -39,7 +39,8 @@  void add_pathspec_matches_against_index(const struct pathspec *pathspec,
 		return;
 	for (i = 0; i < istate->cache_nr; i++) {
 		const struct cache_entry *ce = istate->cache[i];
-		if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
+		if (sw_action == PS_IGNORE_SKIP_WORKTREE &&
+		    (ce_skip_worktree(ce) || !path_in_sparse_checkout(ce->name, istate)))
 			continue;
 		ce_path_match(istate, ce, pathspec, seen);
 	}
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 962bece03e1..c2a4eec548d 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -529,7 +529,7 @@  test_expect_failure 'merge with conflict outside cone' '
 	# NEEDSWORK: Even though the merge conflict removed the
 	# SKIP_WORKTREE bit from the index entry for folder1/a, we should
 	# warn that this is a problematic add.
-	test_all_match git add folder1/a &&
+	test_sparse_match test_must_fail git add folder1/a &&
 	test_all_match git status --porcelain=v2 &&
 
 	# 3. Rename the file to another sparse filename and
@@ -538,7 +538,7 @@  test_expect_failure 'merge with conflict outside cone' '
 	# NEEDSWORK: This mode now fails, because folder2/z is
 	# outside of the sparse-checkout cone and does not match an
 	# existing index entry with the SKIP_WORKTREE bit cleared.
-	test_all_match git add folder2 &&
+	test_sparse_match test_must_fail git add folder2 &&
 	test_all_match git status --porcelain=v2 &&
 
 	test_all_match git merge --continue &&
@@ -566,7 +566,7 @@  test_expect_failure 'cherry-pick/rebase with conflict outside cone' '
 		# NEEDSWORK: Even though the merge conflict removed the
 		# SKIP_WORKTREE bit from the index entry for folder1/a, we should
 		# warn that this is a problematic add.
-		test_all_match git add folder1/a &&
+		test_sparse_match test_must_fail git add folder1/a &&
 		test_all_match git status --porcelain=v2 &&
 
 		# 3. Rename the file to another sparse filename and
@@ -575,7 +575,7 @@  test_expect_failure 'cherry-pick/rebase with conflict outside cone' '
 		# outside of the sparse-checkout cone and does not match an
 		# existing index entry with the SKIP_WORKTREE bit cleared.
 		run_on_all mv folder2/a folder2/z &&
-		test_all_match git add folder2 &&
+		test_sparse_match test_must_fail git add folder2 &&
 		test_all_match git status --porcelain=v2 &&
 
 		test_all_match git $OPERATION --continue &&