Message ID | 068861632b85179d2a5a5ceb966e951a78b27141.1553895166.git.jonathantanmy@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | Batch fetching of missing blobs in diff and show |
Hi Jonathan,

On Fri, 29 Mar 2019, Jonathan Tan wrote:

> Teach oid_object_info_extended() to support a new flag that inhibits
> fetching of missing objects. This is equivalent to setting
> fetch_if_missing to 0, calling oid_object_info_extended(), then setting
> fetch_if_missing to whatever it was before. Update unpack-trees.c to use
> this new flag instead of repeatedly setting fetch_if_missing.
>
> This new flag complicates things slightly in that there are now 2 ways
> to do the same thing.

Just a note that I disagree with the latter part of the sentence: those
are not 2 ways of doing the same thing, but they are two switches that
essentially both have to be flipped to "on". They're just multiple gates.

I do not ask you to rephrase it, merely registering a different opinion.

The patch looks good, I especially like the post-image of
`check_updates()`, which looks much nicer (from my perspective, of
course).

Thanks,
Dscho
On Fri, Mar 29, 2019 at 02:39:27PM -0700, Jonathan Tan wrote:

> Teach oid_object_info_extended() to support a new flag that inhibits
> fetching of missing objects. This is equivalent to setting
> fetch_if_missing to 0, calling oid_object_info_extended(), then setting
> fetch_if_missing to whatever it was before. Update unpack-trees.c to use
> this new flag instead of repeatedly setting fetch_if_missing.
>
> This new flag complicates things slightly in that there are now 2 ways
> to do the same thing. But this eliminates the need to repeatedly set a
> global variable, and more importantly, allows prefetching to be done in
> parallel (in the future); hence, this patch.

Sorry I'm a little late to review this. I don't have any critical
comments, so if this gets ignored, I'll live with it.

> +/*
> + * Do not attempt to fetch the object if missing (even if fetch_if_missing is
> + * nonzero). This is meant for bulk prefetching of missing blobs in a partial
> + * clone. Implies OBJECT_INFO_QUICK.
> + */
> +#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)

Mostly I found the name and semantics of this flag to be a little
confusing. Really what we want is to tell oid_object_info() not to do
any on-demand fetching for us. That seems like a thing that we might
eventually want for other purposes (e.g., a diff operation that could
produce a real blob diff but would be happy outputting a less-detailed
tree diff).

If it were just OBJECT_INFO_NO_FETCH or similar, that tells more clearly
what it does, and would make sense in more contexts. I suspect that
QUICK would be the norm when used with it, though I probably would have
kept the two orthogonal for the sake of simplicity and clarity.

> diff --git a/unpack-trees.c b/unpack-trees.c
> index 22c41a3ba8..381b0cd65e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -404,20 +404,21 @@ static int check_updates(struct unpack_trees_options *o)
>  		 * below.
>  		 */
>  		struct oid_array to_fetch = OID_ARRAY_INIT;
> -		int fetch_if_missing_store = fetch_if_missing;
> -		fetch_if_missing = 0;
>  		for (i = 0; i < index->cache_nr; i++) {
>  			struct cache_entry *ce = index->cache[i];
> -			if ((ce->ce_flags & CE_UPDATE) &&
> -			    !S_ISGITLINK(ce->ce_mode)) {
> -				if (!has_object_file(&ce->oid))
> -					oid_array_append(&to_fetch, &ce->oid);
> -			}
> +
> +			if (!(ce->ce_flags & CE_UPDATE) ||
> +			    S_ISGITLINK(ce->ce_mode))
> +				continue;
> +			if (!oid_object_info_extended(the_repository, &ce->oid,
> +						      NULL,
> +						      OBJECT_INFO_FOR_PREFETCH))
> +				continue;
> +			oid_array_append(&to_fetch, &ce->oid);

Here we get rid of the global set/restore dance, which is nice. But
there's also a behavior change, as we've picked up QUICK. I think that's
probably the right thing to do, but I was a bit surprised not to see any
discussion in the commit message.

-Peff
diff --git a/object-store.h b/object-store.h
index 14fc935bd1..dd3f9b75f0 100644
--- a/object-store.h
+++ b/object-store.h
@@ -280,6 +280,12 @@ struct object_info {
 #define OBJECT_INFO_QUICK 8
 /* Do not check loose object */
 #define OBJECT_INFO_IGNORE_LOOSE 16
+/*
+ * Do not attempt to fetch the object if missing (even if fetch_if_missing is
+ * nonzero). This is meant for bulk prefetching of missing blobs in a partial
+ * clone. Implies OBJECT_INFO_QUICK.
+ */
+#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)
 
 int oid_object_info_extended(struct repository *r,
 			     const struct object_id *,
diff --git a/sha1-file.c b/sha1-file.c
index 494606f771..ad02649124 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1370,7 +1370,8 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 
 		/* Check if it is a missing object */
 		if (fetch_if_missing && repository_format_partial_clone &&
-		    !already_retried && r == the_repository) {
+		    !already_retried && r == the_repository &&
+		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
 			/*
 			 * TODO Investigate having fetch_object() return
 			 * TODO error/success and stopping the music here.
diff --git a/unpack-trees.c b/unpack-trees.c
index 22c41a3ba8..381b0cd65e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -404,20 +404,21 @@ static int check_updates(struct unpack_trees_options *o)
 		 * below.
 		 */
 		struct oid_array to_fetch = OID_ARRAY_INIT;
-		int fetch_if_missing_store = fetch_if_missing;
-		fetch_if_missing = 0;
 		for (i = 0; i < index->cache_nr; i++) {
 			struct cache_entry *ce = index->cache[i];
-			if ((ce->ce_flags & CE_UPDATE) &&
-			    !S_ISGITLINK(ce->ce_mode)) {
-				if (!has_object_file(&ce->oid))
-					oid_array_append(&to_fetch, &ce->oid);
-			}
+
+			if (!(ce->ce_flags & CE_UPDATE) ||
+			    S_ISGITLINK(ce->ce_mode))
+				continue;
+			if (!oid_object_info_extended(the_repository, &ce->oid,
+						      NULL,
+						      OBJECT_INFO_FOR_PREFETCH))
+				continue;
+			oid_array_append(&to_fetch, &ce->oid);
 		}
 		if (to_fetch.nr)
 			fetch_objects(repository_format_partial_clone,
 				      to_fetch.oid, to_fetch.nr);
-		fetch_if_missing = fetch_if_missing_store;
 		oid_array_clear(&to_fetch);
 	}
 	for (i = 0; i < index->cache_nr; i++) {
Teach oid_object_info_extended() to support a new flag that inhibits
fetching of missing objects. This is equivalent to setting
fetch_if_missing to 0, calling oid_object_info_extended(), then setting
fetch_if_missing to whatever it was before. Update unpack-trees.c to use
this new flag instead of repeatedly setting fetch_if_missing.

This new flag complicates things slightly in that there are now 2 ways
to do the same thing. But this eliminates the need to repeatedly set a
global variable, and more importantly, allows prefetching to be done in
parallel (in the future); hence, this patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 object-store.h |  6 ++++++
 sha1-file.c    |  3 ++-
 unpack-trees.c | 17 +++++++++--------
 3 files changed, 17 insertions(+), 9 deletions(-)