Message ID | 401227c2220b6b45d80e21b52e29b6821ca139f9.1596590295.git.jonathantanmy@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | Lazy fetch with subprocess |
Jonathan Tan <jonathantanmy@google.com> writes:

> In a subsequent patch, a null fetch negotiator will be introduced. This
> negotiator, among other things, will not need any information about
> common objects and will have a NULL known_common. Teach fetch-pack to
> allow this.

Hmph, both the default and the skipping negotiator seem to put NULL
in known_common and add_tip when its next() method is called. Also
they clear known_common to NULL after add_tip is called even once.

So, how have we survived so far without this patch to "allow this
(i.e. known_common method to be NULL)"? Is there something that
makes sure a negotiator never gets called from this function after
its .next or .add_tip method is called?

Puzzled. Or is this merely an optimization? If so, it's not like
the change "allows this", but it starts to take advantage of it in
some way.

... goes and looks at mark_complete_and_common_ref()

The function seems to have an unconditional call to ->known_common(),
so anybody passing a negotiator whose known_common is NULL would
already be segfaulting, so this does not appear to be an optimization
but necessary to keep the code from crashing. I cannot quite tell
if it is avoiding unnecessary work, or sweeping crashes under the
rug, though.

Is the untold assumption that mark_complete_and_common_ref() will
never be called after either mark_tips() or find_common() have been
called?

Thanks.

> [NEEDSWORK]
> Optimizing out the ref iteration also affects the execution
> of everything_local(), which relies on COMPLETE being set. (Having said
> that, the typical use case - lazy fetching - would be fine with
> everything_local() always returning that not everything is local.)
>
> This optimization is needed so that in the future, fetch_pack() can be
> used to lazily fetch in a partial clone (without the no_dependents
> flag). This means that fetch_pack() needs a way to execute without
> relying on any targets of refs being present, and thus it cannot use the
> ref iterator (because it checks and lazy-fetches any missing targets).
> (Git currently does not have this problem because we use the
> no_dependents flag, but lazy-fetching will in a subsequent patch be
> changed to use the user-facing fetch command, which does not use this
> flag.)
> [/NEEDSWORK]
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 6c786f5970..5f5474dbed 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
>  	int old_save_commit_buffer = save_commit_buffer;
>  	timestamp_t cutoff = 0;
>
> +	if (!negotiator->known_common)
> +		return;
> +
>  	save_commit_buffer = 0;
>
>  	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> In a subsequent patch, a null fetch negotiator will be introduced. This
>> negotiator, among other things, will not need any information about
>> common objects and will have a NULL known_common. Teach fetch-pack to
>> allow this.
>
> Hmph, both the default and the skipping negotiator seem to put NULL
> in known_common and add_tip when its next() method is called. Also
> they clear known_common to NULL after add_tip is called even once.
>
> So, how have we survived so far without this patch to "allow this
> (i.e. known_common method to be NULL)"? Is there something that
> makes sure a negotiator never gets called from this function after
> its .next or .add_tip method is called?
>
> Puzzled. Or is this merely an optimization? If so, it's not like
> the change "allows this", but it starts to take advantage of it in
> some way.
>
> ... goes and looks at mark_complete_and_common_ref()
>
> The function seems to have an unconditional call to ->known_common(),
> so anybody passing a negotiator whose known_common is NULL would
> already be segfaulting, so this does not appear to be an optimization
> but necessary to keep the code from crashing. I cannot quite tell
> if it is avoiding unnecessary work, or sweeping crashes under the
> rug, though.
>
> Is the untold assumption that mark_complete_and_common_ref() will
> never be called after either mark_tips() or find_common() have been
> called?

Shot in the dark. Perhaps clearing of .add_tip and .known_common in
the .next method was done to catch a wrong calling sequence where
mark_complete_and_common_ref() gets called after mark_tips() and/or
find_common() have been called, by forcing the code to segfault?

If so, this patch removes the safety and we may want to add an
equivalent safety logic. Perhaps by adding a state field in the
negotiator instance to record that mark_tips() and/or find_common()
have been used and call a BUG() if mark_complete_and_common_ref()
gets called after that, if enforcing such an invariant was the
original reason why these fields were cleared.
> > Hmph, both the default and the skipping negotiator seem to put NULL
> > in known_common and add_tip when its next() method is called. Also
> > they clear known_common to NULL after add_tip is called even once.
> >
> > So, how have we survived so far without this patch to "allow this
> > (i.e. known_common method to be NULL)"? Is there something that
> > makes sure a negotiator never gets called from this function after
> > its .next or .add_tip method is called?

[snip]

> > Is the untold assumption that mark_complete_and_common_ref() will
> > never be called after either mark_tips() or find_common() have been
> > called?
>
> Shot in the dark. Perhaps clearing of .add_tip and .known_common in
> the .next method was done to catch a wrong calling sequence where
> mark_complete_and_common_ref() gets called after mark_tips() and/or
> find_common() have been called, by forcing the code to segfault?

Ah...yes, if I remember correctly, that was my original intention when I
set them to NULL.

> If so, this
> patch removes the safety and we may want to add an equivalent safety
> logic. Perhaps by adding a state field in the negotiator instance
> to record that mark_tips() and/or find_common() have been used and
> call a BUG() if mark_complete_and_common_ref() gets called after that,
> if enforcing such an invariant was the original reason why these
> fields were cleared.

Sounds good. As I said in my reply to your query on patch 1, we might
not need to set NULL anymore, but if we do, I'll do this.
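For readers following the thread, the state-field idea discussed above could look roughly like the sketch below. It is only an illustration under stated assumptions: `struct sketch_negotiator`, the `negotiation_started` field, `sketch_bug()` and the other `sketch_*` names are hypothetical stand-ins, not Git's actual `struct fetch_negotiator`, `BUG()` or fetch-pack functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for Git's struct fetch_negotiator. */
struct sketch_negotiator {
	int negotiation_started; /* set once tips are marked or negotiation begins */
	int common_marked;       /* how many times common refs were marked */
};

/* Stand-in for Git's BUG(): report the broken invariant and abort. */
static void sketch_bug(const char *msg)
{
	fprintf(stderr, "BUG: %s\n", msg);
	abort();
}

/* Stand-in for mark_tips()/find_common(): records that negotiation began. */
static void sketch_mark_tips(struct sketch_negotiator *n)
{
	n->negotiation_started = 1;
}

/* Returns 1 while marking common refs is still a valid call. */
static int sketch_can_mark_common(const struct sketch_negotiator *n)
{
	return !n->negotiation_started;
}

/*
 * Stand-in for mark_complete_and_common_ref(): instead of relying on a
 * NULL function pointer to crash, check the recorded state explicitly.
 */
static void sketch_mark_complete_and_common_ref(struct sketch_negotiator *n)
{
	if (!sketch_can_mark_common(n))
		sketch_bug("common refs marked after negotiation started");
	n->common_marked++;
}
```

With a check like this, a wrong calling sequence is reported with a message rather than surfacing as a segfault on a cleared method pointer, and a negotiator that legitimately has no known_common method is no longer indistinguishable from one used out of order.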
diff --git a/fetch-pack.c b/fetch-pack.c
index 6c786f5970..5f5474dbed 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	int old_save_commit_buffer = save_commit_buffer;
 	timestamp_t cutoff = 0;
 
+	if (!negotiator->known_common)
+		return;
+
 	save_commit_buffer = 0;
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
In a subsequent patch, a null fetch negotiator will be introduced. This
negotiator, among other things, will not need any information about
common objects and will have a NULL known_common. Teach fetch-pack to
allow this.

[NEEDSWORK]
Optimizing out the ref iteration also affects the execution
of everything_local(), which relies on COMPLETE being set. (Having said
that, the typical use case - lazy fetching - would be fine with
everything_local() always returning that not everything is local.)

This optimization is needed so that in the future, fetch_pack() can be
used to lazily fetch in a partial clone (without the no_dependents
flag). This means that fetch_pack() needs a way to execute without
relying on any targets of refs being present, and thus it cannot use the
ref iterator (because it checks and lazy-fetches any missing targets).
(Git currently does not have this problem because we use the
no_dependents flag, but lazy-fetching will in a subsequent patch be
changed to use the user-facing fetch command, which does not use this
flag.)
[/NEEDSWORK]

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 3 +++
 1 file changed, 3 insertions(+)
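
To make the effect of a NULL known_common concrete, here is a small self-contained model of the guard. It is a sketch under assumptions: `struct sketch_negotiator`, `sketch_record_common()` and `sketch_mark_common_refs()` are hypothetical names, refs are modelled as plain integers, and the real function does considerably more work (parsing refs, computing a cutoff, setting COMPLETE).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for Git's struct fetch_negotiator. */
struct sketch_negotiator {
	/* NULL for a "null" negotiator that needs no common-object info */
	void (*known_common)(struct sketch_negotiator *n, int ref_id);
	int common_count;
};

/* A known_common implementation that simply counts what it is told. */
static void sketch_record_common(struct sketch_negotiator *n, int ref_id)
{
	(void)ref_id; /* the id itself is not needed in this model */
	n->common_count++;
}

/*
 * Model of the guarded function: skip the ref iteration entirely when
 * the negotiator has no known_common method. Returns the number of
 * refs that were visited.
 */
static int sketch_mark_common_refs(struct sketch_negotiator *n,
				   const int *refs, size_t nr)
{
	size_t i;

	if (!n->known_common)
		return 0; /* nothing iterated, nothing marked */

	for (i = 0; i < nr; i++)
		n->known_common(n, refs[i]);
	return (int)nr;
}
```

In this model a negotiator with a NULL known_common visits no refs at all, which mirrors the side effect noted in the NEEDSWORK section: skipping the iteration also means nothing that depends on it (such as COMPLETE in the real code) gets set.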