Message ID | 371985352680a767dfacb5477aa77e92e04008ee.1628625013.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Sparse index: delete ignored files outside sparse cone | expand |
On Tue, Aug 10, 2021 at 12:50 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: Derrick Stolee <dstolee@microsoft.com> > > If cache_tree_update() returns a non-zero value, then it could not > create the cache tree. This is likely due to a path having a merge > conflict. Since we are already returning early, let's return silently to > avoid making it seem like we failed to write the index at all. > > If we remove our dependence on the cache tree within > convert_to_sparse(), then we could still recover from this scenario and > have a sparse index. > > When constructing the cache-tree extension in convert_to_sparse(), it is > possible that we construct a tree object that is new to the object > database. Without the WRITE_TREE_MISSING_OK flag, this results in an > error that halts our conversion to a sparse index. Add this flag to > remove this limitation. Would this only happen when the user has staged but uncommitted changes outside the sparsity paths, and tries to sparsify while in that state? (Notably, this is a much different condition than the above mentioned merge conflict case that would case cache_tree_update() to just fail.) I think it might be nicer to split this commit in two, just to make it easier to understand for future readers. This seems like two logical changes and trying to understand them and why would I think be easier if the two were split. I'd be tempted to put the WRITE_TREE_MISSING_OK first. > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > --- > sparse-index.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/sparse-index.c b/sparse-index.c > index bc5900eae35..b6e90417556 100644 > --- a/sparse-index.c > +++ b/sparse-index.c > @@ -179,10 +179,15 @@ int convert_to_sparse(struct index_state *istate) > > /* Clear and recompute the cache-tree */ > cache_tree_free(&istate->cache_tree); > - if (cache_tree_update(istate, 0)) { > - warning(_("unable to update cache-tree, staying full")); > - return -1; > - } > + /* > + * Silently return if there is a problem with the cache tree update, > + * which might just be due to a conflict state in some entry. > + * > + * This might create new tree objects, so be sure to use > + * WRITE_TREE_MISSING_OK. > + */ > + if (cache_tree_update(istate, WRITE_TREE_MISSING_OK)) > + return 0; > > remove_fsmonitor(istate); > > -- > gitgitgadget These feel like cases where it would be nice to have a testcase demonstrating the change in behavior. Perhaps just splitting the commit would be enough, but it took a bit of time to try to understand what would change and why, despite the simple changes.
On 8/19/2021 2:24 PM, Elijah Newren wrote: > On Tue, Aug 10, 2021 at 12:50 PM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> From: Derrick Stolee <dstolee@microsoft.com> >> >> If cache_tree_update() returns a non-zero value, then it could not >> create the cache tree. This is likely due to a path having a merge >> conflict. Since we are already returning early, let's return silently to >> avoid making it seem like we failed to write the index at all. >> >> If we remove our dependence on the cache tree within >> convert_to_sparse(), then we could still recover from this scenario and >> have a sparse index. >> >> When constructing the cache-tree extension in convert_to_sparse(), it is >> possible that we construct a tree object that is new to the object >> database. Without the WRITE_TREE_MISSING_OK flag, this results in an >> error that halts our conversion to a sparse index. Add this flag to >> remove this limitation. > > Would this only happen when the user has staged but uncommitted > changes outside the sparsity paths, and tries to sparsify while in > that state? (Notably, this is a much different condition than the > above mentioned merge conflict case that would case > cache_tree_update() to just fail.) > > I think it might be nicer to split this commit in two, just to make it > easier to understand for future readers. This seems like two logical > changes and trying to understand them and why would I think be easier > if the two were split. I'd be tempted to put the > WRITE_TREE_MISSING_OK first. Ironically, I _had_ this as two commits because I discovered the problems independently. It wasn't until I was organizing things that I realized I was editing the same 'if' twice and thought it better to merge patches. But I also don't feel strongly about that, so I can split them. >> + /* >> + * Silently return if there is a problem with the cache tree update, >> + * which might just be due to a conflict state in some entry. >> + * >> + * This might create new tree objects, so be sure to use >> + * WRITE_TREE_MISSING_OK. >> + */ >> + if (cache_tree_update(istate, WRITE_TREE_MISSING_OK)) >> + return 0; ... > These feel like cases where it would be nice to have a testcase > demonstrating the change in behavior. Perhaps just splitting the > commit would be enough, but it took a bit of time to try to understand > what would change and why, despite the simple changes. I found these were required in the Scalar functional tests, so I bet that if I remove this change I can create a test case from that. Thanks. -Stolee
diff --git a/sparse-index.c b/sparse-index.c index bc5900eae35..b6e90417556 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -179,10 +179,15 @@ int convert_to_sparse(struct index_state *istate) /* Clear and recompute the cache-tree */ cache_tree_free(&istate->cache_tree); - if (cache_tree_update(istate, 0)) { - warning(_("unable to update cache-tree, staying full")); - return -1; - } + /* + * Silently return if there is a problem with the cache tree update, + * which might just be due to a conflict state in some entry. + * + * This might create new tree objects, so be sure to use + * WRITE_TREE_MISSING_OK. + */ + if (cache_tree_update(istate, WRITE_TREE_MISSING_OK)) + return 0; remove_fsmonitor(istate);