Message ID | 399ddb0bad56c69ff9d9591f5e8eacf52cf50a15.1615404665.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Sparse Index: Design, Format, Tests | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > void ensure_full_index(struct index_state *istate) > { > ... > + int i; > + tree = lookup_tree(istate->repo, &ce->oid); > + > + memset(&ps, 0, sizeof(ps)); > + ps.recursive = 1; > + ps.has_wildcard = 1; > + ps.max_depth = -1; > + > + read_tree_recursive(istate->repo, tree, > + ce->name, strlen(ce->name), > + 0, &ps, > + add_path_to_index, full); Ævar, the assumption that led to your e68237bb (tree.h API: remove support for starting at prefix != "", 2021-03-08) closes the door for this code rather badly. Please work with Derrick to figure out what the best course of action would be. Thanks. > + /* free directory entries. full entries are re-used */ > + discard_cache_entry(ce); > + } > + > + /* Copy back into original index. */ > + memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash)); > + istate->sparse_index = 0; > + free(istate->cache); > + istate->cache = full->cache; > + istate->cache_nr = full->cache_nr; > + istate->cache_alloc = full->cache_alloc; > + > + free(full); > + > + trace2_region_leave("index", "ensure_full_index", istate->repo); > }
On 3/12/2021 1:50 AM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> void ensure_full_index(struct index_state *istate) >> { >> ... >> + int i; >> + tree = lookup_tree(istate->repo, &ce->oid); >> + >> + memset(&ps, 0, sizeof(ps)); >> + ps.recursive = 1; >> + ps.has_wildcard = 1; >> + ps.max_depth = -1; >> + >> + read_tree_recursive(istate->repo, tree, >> + ce->name, strlen(ce->name), >> + 0, &ps, >> + add_path_to_index, full); > > Ævar, the assumption that led to your e68237bb (tree.h API: remove > support for starting at prefix != "", 2021-03-08) closes the door > for this code rather badly. Please work with Derrick to figure out > what the best course of action would be. Thanks for pointing this out, Junio. My preference would be to drop "tree.h API: remove support for starting at prefix != """, but it should be OK to keep "tree.h API: remove "stage" parameter from read_tree_recursive()" (currently b3a078863f6), even though it introduces a semantic conflict here. Since I haven't seen my sparse-index topic get picked up by a tracking branch, I'd be happy to rebase on top of Ævar's topic if I can still set a non-root prefix. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: >> Ævar, the assumption that led to your e68237bb (tree.h API: remove >> support for starting at prefix != "", 2021-03-08) closes the door >> for this code rather badly. Please work with Derrick to figure out >> what the best course of action would be. > > Thanks for pointing this out, Junio. > > My preference would be to drop "tree.h API: remove support for > starting at prefix != """, but it should be OK to keep "tree.h API: > remove "stage" parameter from read_tree_recursive()" (currently > b3a078863f6), even though it introduces a semantic conflict here. > > Since I haven't seen my sparse-index topic get picked up by a > tracking branch, I'd be happy to rebase on top of Ævar's topic if > I can still set a non-root prefix. I did try to have both in 'seen' (after all, that is the primary way I find out these conflicts early---no one can keep all the details of all the topics in flight in one's head), and saw that we now have a need for non-empty prefix that we thought we no longer have in the other topic --- I think we should probably keep support of non-empty prefix (as the primary reason why that patch exists is because we saw no in-tree users---now if your 05/20 proves to be a good use of the feature, there is one fewer reasons to remove the support) in some form, so discarding e68237bb certainly is an option. If we were to base the sparse-index topic on top of ab/read-tree, we may be able to gain further simplification and clean-up of the API. I think all the clean-up value e68237bb has are on the calling side (they no longer have to pass constant ("", 0) to the function), and we could rewrite e68237bb by - renaming "read_tree_recursive()" to "read_tree_at()", with the non-empty prefix support. - creating a new function "read_tree()", which lacks the support for prefix, as a thin-wrapper around "read_tree_at()". - modifying the callers of "read_tree_recursive()" changed by e68237bb to instead call "read_tree()" (without prefix). to simplify majority of calling sites without losing functionality. Then your [05/20] can use the read_tree_at() to read with a prefix. But that kind of details, I'd want to see you two figure out yourselves. Thanks.
On 3/12/2021 3:08 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > >>> Ævar, the assumption that led to your e68237bb (tree.h API: remove >>> support for starting at prefix != "", 2021-03-08) closes the door >>> for this code rather badly. Please work with Derrick to figure out >>> what the best course of action would be. >> >> Thanks for pointing this out, Junio. >> >> My preference would be to drop "tree.h API: remove support for >> starting at prefix != """, but it should be OK to keep "tree.h API: >> remove "stage" parameter from read_tree_recursive()" (currently >> b3a078863f6), even though it introduces a semantic conflict here. >> >> Since I haven't seen my sparse-index topic get picked up by a >> tracking branch, I'd be happy to rebase on top of Ævar's topic if >> I can still set a non-root prefix. > I think all the clean-up value e68237bb has are on the calling side > (they no longer have to pass constant ("", 0) to the function), and > we could rewrite e68237bb by > > - renaming "read_tree_recursive()" to "read_tree_at()", with the > non-empty prefix support. > > - creating a new function "read_tree()", which lacks the support > for prefix, as a thin-wrapper around "read_tree_at()". > > - modifying the callers of "read_tree_recursive()" changed by > e68237bb to instead call "read_tree()" (without prefix). > > to simplify majority of calling sites without losing functionality. > > Then your [05/20] can use the read_tree_at() to read with a prefix. > > > But that kind of details, I'd want to see you two figure out > yourselves. You've given us a great proposal. I'll wait for Ævar to chime in (and probably update his topic) before I submit a new version. Thanks, -Stolee
On Fri, Mar 12 2021, Derrick Stolee wrote: > On 3/12/2021 3:08 PM, Junio C Hamano wrote: >> Derrick Stolee <stolee@gmail.com> writes: >> >>>> Ævar, the assumption that led to your e68237bb (tree.h API: remove >>>> support for starting at prefix != "", 2021-03-08) closes the door >>>> for this code rather badly. Please work with Derrick to figure out >>>> what the best course of action would be. >>> >>> Thanks for pointing this out, Junio. >>> >>> My preference would be to drop "tree.h API: remove support for >>> starting at prefix != """, but it should be OK to keep "tree.h API: >>> remove "stage" parameter from read_tree_recursive()" (currently >>> b3a078863f6), even though it introduces a semantic conflict here. >>> >>> Since I haven't seen my sparse-index topic get picked up by a >>> tracking branch, I'd be happy to rebase on top of Ævar's topic if >>> I can still set a non-root prefix. >> I think all the clean-up value e68237bb has are on the calling side >> (they no longer have to pass constant ("", 0) to the function), and >> we could rewrite e68237bb by >> >> - renaming "read_tree_recursive()" to "read_tree_at()", with the >> non-empty prefix support. >> >> - creating a new function "read_tree()", which lacks the support >> for prefix, as a thin-wrapper around "read_tree_at()". >> >> - modifying the callers of "read_tree_recursive()" changed by >> e68237bb to instead call "read_tree()" (without prefix). >> >> to simplify majority of calling sites without losing functionality. >> >> Then your [05/20] can use the read_tree_at() to read with a prefix. >> >> >> But that kind of details, I'd want to see you two figure out >> yourselves. > > You've given us a great proposal. I'll wait for Ævar to chime in > (and probably update his topic) before I submit a new version. I've re-rolled my series just now at https://lore.kernel.org/git/20210315234344.28427-1-avarab@gmail.com/ sorry for the delay. You should be able to rebase easily on top of it, although note that the new read_tree_at() uses a strbuf, but is otherwise the same as the old read_tree_recursive(). Note that the pathspec can also be used to get to where read_tree_recursive() would have brought you. I haven't looked at whether there's reasons to convert in-tree (or this) code to pathspec use, or vice-versa convert some things that use pathspecs now (e.g. ls-tree with a path) to providing a prefix via the strbuf.
diff --git a/cache.h b/cache.h index d92814961405..1f0b42264606 100644 --- a/cache.h +++ b/cache.h @@ -204,6 +204,8 @@ struct cache_entry { #error "CE_EXTENDED_FLAGS out of range" #endif +#define S_ISSPARSEDIR(m) ((m) == S_IFDIR) + /* Forward structure decls */ struct pathspec; struct child_process; @@ -319,7 +321,14 @@ struct index_state { drop_cache_tree : 1, updated_workdir : 1, updated_skipworktree : 1, - fsmonitor_has_run_once : 1; + fsmonitor_has_run_once : 1, + + /* + * sparse_index == 1 when sparse-directory + * entries exist. Requires sparse-checkout + * in cone mode. + */ + sparse_index : 1; struct hashmap name_hash; struct hashmap dir_hash; struct object_id oid; @@ -722,6 +731,8 @@ int read_index_from(struct index_state *, const char *path, const char *gitdir); int is_index_unborn(struct index_state *); +void ensure_full_index(struct index_state *istate); + /* For use with `write_locked_index()`. */ #define COMMIT_LOCK (1 << 0) #define SKIP_IF_UNCHANGED (1 << 1) diff --git a/read-cache.c b/read-cache.c index 29144cf879e7..97dbf2434f30 100644 --- a/read-cache.c +++ b/read-cache.c @@ -101,6 +101,9 @@ static const char *alternate_index_output; static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { + if (S_ISSPARSEDIR(ce->ce_mode)) + istate->sparse_index = 1; + istate->cache[nr] = ce; add_name_hash(istate, ce); } @@ -2255,6 +2258,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) trace2_data_intmax("index", the_repository, "read/cache_nr", istate->cache_nr); + if (!istate->repo) + istate->repo = the_repository; + prepare_repo_settings(istate->repo); + if (istate->repo->settings.command_requires_full_index) + ensure_full_index(istate); + return istate->cache_nr; unmap: diff --git a/sparse-index.c b/sparse-index.c index 82183ead563b..316cb949b74b 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -1,8 +1,101 @@ #include "cache.h" #include "repository.h" #include "sparse-index.h" +#include "tree.h" +#include "pathspec.h" +#include "trace2.h" + +static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) +{ + ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc); + + istate->cache[nr] = ce; + add_name_hash(istate, ce); +} + +static int add_path_to_index(const struct object_id *oid, + struct strbuf *base, const char *path, + unsigned int mode, int stage, void *context) +{ + struct index_state *istate = (struct index_state *)context; + struct cache_entry *ce; + size_t len = base->len; + + if (S_ISDIR(mode)) + return READ_TREE_RECURSIVE; + + strbuf_addstr(base, path); + + ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0); + ce->ce_flags |= CE_SKIP_WORKTREE; + set_index_entry(istate, istate->cache_nr++, ce); + + strbuf_setlen(base, len); + return 0; +} void ensure_full_index(struct index_state *istate) { - /* intentionally left blank */ + int i; + struct index_state *full; + + if (!istate || !istate->sparse_index) + return; + + if (!istate->repo) + istate->repo = the_repository; + + trace2_region_enter("index", "ensure_full_index", istate->repo); + + /* initialize basics of new index */ + full = xcalloc(1, sizeof(struct index_state)); + memcpy(full, istate, sizeof(struct index_state)); + + /* then change the necessary things */ + full->sparse_index = 0; + full->cache_alloc = (3 * istate->cache_alloc) / 2; + full->cache_nr = 0; + ALLOC_ARRAY(full->cache, full->cache_alloc); + + for (i = 0; i < istate->cache_nr; i++) { + struct cache_entry *ce = istate->cache[i]; + struct tree *tree; + struct pathspec ps; + + if (!S_ISSPARSEDIR(ce->ce_mode)) { + set_index_entry(full, full->cache_nr++, ce); + continue; + } + if (!(ce->ce_flags & CE_SKIP_WORKTREE)) + warning(_("index entry is a directory, but not sparse (%08x)"), + ce->ce_flags); + + /* recursively walk into cd->name */ + tree = lookup_tree(istate->repo, &ce->oid); + + memset(&ps, 0, sizeof(ps)); + ps.recursive = 1; + ps.has_wildcard = 1; + ps.max_depth = -1; + + read_tree_recursive(istate->repo, tree, + ce->name, strlen(ce->name), + 0, &ps, + add_path_to_index, full); + + /* free directory entries. full entries are re-used */ + discard_cache_entry(ce); + } + + /* Copy back into original index. */ + memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash)); + istate->sparse_index = 0; + free(istate->cache); + istate->cache = full->cache; + istate->cache_nr = full->cache_nr; + istate->cache_alloc = full->cache_alloc; + + free(full); + + trace2_region_leave("index", "ensure_full_index", istate->repo); }