Message ID | a32e3d6146dd41af36f525a744d6cc099b42d6fb.1666967670.git.ps@pks.im (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | receive-pack: use advertised reference tips to inform connectivity check | expand |
On Fri, Oct 28 2022, Patrick Steinhardt wrote: > @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > > rev_list_in = xfdopen(rev_list.in, "w"); > > + if (opt->reachable_oids_fn) { > + const struct object_id *reachable_oid; > + while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL) > + if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0) > + break; Just a style nit, we tend to avoid != NULL, != 0 etc. comparisons. I see connected.c has some of that already, but for new code let's just check truthiness. Also for such a small scope a shorter variable name helps us stay at the usual column limits: if (opt->reachable_oids_fn) { const struct object_id *oid; while ((oid = opt->reachable_oids_fn(opt->reachable_oids_data))) if (fprintf(rev_list_in, "^%s\n", oid_to_hex(oid)) < 0) break; The fprintf() return value checking seemed a bit odd, not because we shouldn't do it, but because we usually don't bother. For other reviewers: We have that form already in connected.c, so at least locally we're not being diligently careful, only to have it undone by adjacent code... Looks good!
Patrick Steinhardt <ps@pks.im> writes: > diff --git a/connected.c b/connected.c > index 74a20cb32e..2a4c4e0025 100644 > --- a/connected.c > +++ b/connected.c > @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > strvec_push(&rev_list.args, "--stdin"); > if (has_promisor_remote()) > strvec_push(&rev_list.args, "--exclude-promisor-objects"); > - if (!opt->is_deepening_fetch) { > + if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) { > strvec_push(&rev_list.args, "--not"); > strvec_push(&rev_list.args, "--all"); > } > @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > > rev_list_in = xfdopen(rev_list.in, "w"); > > + if (opt->reachable_oids_fn) { > + const struct object_id *reachable_oid; > + while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL) > + if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0) > + break; > + } It is good that these individual negative references are fed from the standard input, not on the command line, as they can be many. In the original code without the reachable_oids_fn, we refrain from excluding when the is_deepening_fetch bit is set, but here we do not pay attention to the bit at all. Is that sensible, and if so why?
On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote: > In the original code without the reachable_oids_fn, we refrain from > excluding when the is_deepening_fetch bit is set, but here we do not > pay attention to the bit at all. Is that sensible, and if so why? I was wondering the same thing. Thanks for asking. Thanks, Taylor
On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > diff --git a/connected.c b/connected.c > > index 74a20cb32e..2a4c4e0025 100644 > > --- a/connected.c > > +++ b/connected.c > > @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > > strvec_push(&rev_list.args, "--stdin"); > > if (has_promisor_remote()) > > strvec_push(&rev_list.args, "--exclude-promisor-objects"); > > - if (!opt->is_deepening_fetch) { > > + if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) { > > strvec_push(&rev_list.args, "--not"); > > strvec_push(&rev_list.args, "--all"); > > } > > @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > > > > rev_list_in = xfdopen(rev_list.in, "w"); > > > > + if (opt->reachable_oids_fn) { > > + const struct object_id *reachable_oid; > > + while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL) > > + if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0) > > + break; > > + } > > It is good that these individual negative references are fed from > the standard input, not on the command line, as they can be many. > > In the original code without the reachable_oids_fn, we refrain from > excluding when the is_deepening_fetch bit is set, but here we do not > pay attention to the bit at all. Is that sensible, and if so why? Hm, good point. On a deepening fetch the commits that were the previous boundary will likely get replaced by new commits that are at a deeper point in history, so they cannot be used as a well-defined boundary. Instead, we do a complete graph-walk that doesn't stop at any previously known commits at all. At least that's how I understand the code, the explanation is likely a bit fuzzy. I guess we should thus also pay attention to `is_deepening_fetch` here. As this means that `is_deepening_fetch` and `reachable_oids_fn` are mutually exclusive I'm inclined to go even further and `die()` if both are set at the same time. We only adapt git-receive-pack(1) anyway, so we should never run into this situation for now. Patrick
On Mon, Oct 31, 2022 at 02:10:16PM +0100, Patrick Steinhardt wrote: > I guess we should thus also pay attention to `is_deepening_fetch` here. > As this means that `is_deepening_fetch` and `reachable_oids_fn` are > mutually exclusive I'm inclined to go even further and `die()` if both > are set at the same time. We only adapt git-receive-pack(1) anyway, so > we should never run into this situation for now. Yes, I agree. Thanks, Taylor
diff --git a/connected.c b/connected.c index 74a20cb32e..2a4c4e0025 100644 --- a/connected.c +++ b/connected.c @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data, strvec_push(&rev_list.args, "--stdin"); if (has_promisor_remote()) strvec_push(&rev_list.args, "--exclude-promisor-objects"); - if (!opt->is_deepening_fetch) { + if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) { strvec_push(&rev_list.args, "--not"); strvec_push(&rev_list.args, "--all"); } @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data, rev_list_in = xfdopen(rev_list.in, "w"); + if (opt->reachable_oids_fn) { + const struct object_id *reachable_oid; + while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL) + if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0) + break; + } + do { /* * If index-pack already checked that: diff --git a/connected.h b/connected.h index 6e59c92aa3..f09c7d7884 100644 --- a/connected.h +++ b/connected.h @@ -46,6 +46,13 @@ struct check_connected_options { * during a fetch. */ unsigned is_deepening_fetch : 1; + + /* + * If non-NULL, use this iterator to determine the set of reachable + * objects instead of marking all references as unreachable. + */ + oid_iterate_fn reachable_oids_fn; + void *reachable_oids_data; }; #define CHECK_CONNECTED_INIT { 0 }
The connectivity check is executed via git-receive-pack(1) to verify that a client has provided all references that are required to satisfy a set of reference updates. What the connectivity check does is to walk the object graph with all reference tips as starting points while all preexisting reference tips are marked as uninteresting. Preexisting references are currently marked uninteresting by passing `--not --all` to git-rev-list(1). Some users of the connectivity check may have a better picture of which objects should be regarded as uninteresting though, e.g. by reusing information from the reference advertisement when serving a push. Add a new field to `struct check_connected_options` that allows callers to replace the `--not --all` logic with their own set of object IDs they regard as uninteresting. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- connected.c | 9 ++++++++- connected.h | 7 +++++++ 2 files changed, 15 insertions(+), 1 deletion(-)