Message ID | 20200915095827.52047-2-hanxin.hx@alibaba-inc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] t5534: new test case for atomic signed push | expand |
Han Xin <chiyutianyi@gmail.com> writes: > Atomic push may be rejected, which makes it meanigless to generate push > cert first. Therefore, the push cert generation was moved after atomic > check. The overstatement "may be rejected" is probably a bit misleading the readers and reviewers. REF_STATUS_REJECT_NONFASTFORWARD may be observed by check_to_send_update() but the reason is set in set_ref_status_for_push(), which locally decides not to propose a no-ff ref update to the other side. At this point of the control flow, the other side hasn't have a chance to reject the push, because "we want you to update these refs to these new values" is yet to be sent (it is done after composing the push certificate). We may decide not to push (e.g. their ref may not fast forward to what we are pushing) at this point in the code. Checking the condition first will save us to ask GPG to sign the push certificate, since we will not send it to the other side. > - if (!args->dry_run) > - advertise_shallow_grafts_buf(&req_buf); Why should this be moved? It doesn't seem to have anything to do with the push certificate. > - > - if (!args->dry_run && push_cert_nonce) > - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, > - cap_buf.buf, push_cert_nonce); > - > /* > * Clear the status for each ref and see if we need to send > * the pack data. This "Clear the status for each ref" worries me. The generate_push_cert() function RELIES on ref->status and filters out the ref that failed to pass the local check from the generated push certificate. If you let the loop (post context of this hunk) run, ref->status will be updated by it, so the net effect of this patch is that it breaks "non-atomic" case that pushes multiple refs and one of ref fails to pass the local check. IOW, generate_push_cert() MUST be called before this loop "clears the status for each ref" by assigning to ref->status. > @@ -489,6 +482,13 @@ int send_pack(struct send_pack_args *args, > ref->status = REF_STATUS_EXPECTING_REPORT; > } > > + if (!args->dry_run) > + advertise_shallow_grafts_buf(&req_buf); > + > + if (!args->dry_run && push_cert_nonce) > + cmds_sent = generate_push_cert(&req_buf, remote_refs, args, > + cap_buf.buf, push_cert_nonce); > + So, this change as-is is probably a bad idea. I wonder if generate_push_cert() can be told about atomicity of the push, though. There is this loop in the function: int update_seen = 0; ... for (ref = remote_refs; ref; ref = ref->next) { if (check_to_send_update(ref, args) < 0) continue; update_seen = 1; strbuf_addf(&cert, "%s %s %s\n", oid_to_hex(&ref->old_oid), oid_to_hex(&ref->new_oid), ref->name); } if (!update_seen) goto free_return; that makes it a no-op without invoking GPG if no update is needed. Perhaps we can extend it to int failure_seen = 0; int update_seen = 0; ... for (ref = remote_refs; ref; ref = ref->next) { switch (check_to_send_update(ref, args)) { case CHECK_REF_STATUS_REJECTED: failure_seen = 1; break; case 0: update_seen = 1; break; case REF_STATUS_UPTODATE: break; /* OK */ default: BUG("should not happen"); } strbuf_addf(&cert, "%s %s %s\n", oid_to_hex(&ref->old_oid), oid_to_hex(&ref->new_oid), ref->name); } if (!update_seen || (use_atomic && failure_seen)) goto free_return; to make it also a no-op when any local rejection under atomic mode? Thanks.
Junio C Hamano <gitster@pobox.com> writes: > I wonder if generate_push_cert() can be told about atomicity of the > push, though. There is this loop in the function: > > int update_seen = 0; > > ... > for (ref = remote_refs; ref; ref = ref->next) { > if (check_to_send_update(ref, args) < 0) > continue; > update_seen = 1; > strbuf_addf(&cert, "%s %s %s\n", > oid_to_hex(&ref->old_oid), > oid_to_hex(&ref->new_oid), > ref->name); > } > if (!update_seen) > goto free_return; > > that makes it a no-op without invoking GPG if no update is needed. > Perhaps we can extend it to > > int failure_seen = 0; > int update_seen = 0; > > ... > for (ref = remote_refs; ref; ref = ref->next) { > switch (check_to_send_update(ref, args)) { > case CHECK_REF_STATUS_REJECTED: > failure_seen = 1; > break; This "break" should be "continue" here. We want to exclude the ones that we are not going to send to the other side from the push certificate (in non-atomic case). > case 0: > update_seen = 1; > break; > case REF_STATUS_UPTODATE: > break; /* OK */ > default: > BUG("should not happen"); > } > strbuf_addf(&cert, "%s %s %s\n", > oid_to_hex(&ref->old_oid), > oid_to_hex(&ref->new_oid), > ref->name); > } > if (!update_seen || (use_atomic && failure_seen)) > goto free_return; > > to make it also a no-op when any local rejection under atomic mode? > > Thanks.
Junio C Hamano <gitster@pobox.com> 于2020年9月16日周三 上午5:08写道: > > Han Xin <chiyutianyi@gmail.com> writes: > > > Atomic push may be rejected, which makes it meanigless to generate push > > cert first. Therefore, the push cert generation was moved after atomic > > check. > > The overstatement "may be rejected" is probably a bit misleading the > readers and reviewers. REF_STATUS_REJECT_NONFASTFORWARD may be > observed by check_to_send_update() but the reason is set in > set_ref_status_for_push(), which locally decides not to propose a > no-ff ref update to the other side. At this point of the control > flow, the other side hasn't have a chance to reject the push, > because "we want you to update these refs to these new values" is > yet to be sent (it is done after composing the push certificate). > > We may decide not to push (e.g. their ref may not fast forward > to what we are pushing) at this point in the code. Checking the > condition first will save us to ask GPG to sign the push > certificate, since we will not send it to the other side. > It's always hard for a new contributor to write a decent commit log message. > > > - if (!args->dry_run) > > - advertise_shallow_grafts_buf(&req_buf); > > Why should this be moved? It doesn't seem to have anything to do > with the push certificate. > Checking the condition first will also save us to prepare shallow advertise. > > - > > - if (!args->dry_run && push_cert_nonce) > > - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, > > - cap_buf.buf, push_cert_nonce); > > - > > /* > > * Clear the status for each ref and see if we need to send > > * the pack data. > > This "Clear the status for each ref" worries me. > > The generate_push_cert() function RELIES on ref->status and filters > out the ref that failed to pass the local check from the generated > push certificate. If you let the loop (post context of this hunk) > run, ref->status will be updated by it, so the net effect of this > patch is that it breaks "non-atomic" case that pushes multiple refs > and one of ref fails to pass the local check. > > IOW, generate_push_cert() MUST be called before this loop "clears > the status for each ref" by assigning to ref->status. > The next block ("Finally, tell the other end!") is what we send commands to "receive-pack", right after some of the status are reset to REF_STATUS_OK or REF_STATUS_EXPECTING_REPORT by this chunk of code. So moving the generate_push_cert() part right before the "Finally, tell the other end!" part LGTM.
Jiang Xin <worldhello.net@gmail.com> writes: >> > - >> > - if (!args->dry_run && push_cert_nonce) >> > - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, >> > - cap_buf.buf, push_cert_nonce); >> > - >> > /* >> > * Clear the status for each ref and see if we need to send >> > * the pack data. >> >> This "Clear the status for each ref" worries me. >> >> The generate_push_cert() function RELIES on ref->status and filters >> out the ref that failed to pass the local check from the generated >> push certificate. If you let the loop (post context of this hunk) >> run, ref->status will be updated by it, so the net effect of this >> patch is that it breaks "non-atomic" case that pushes multiple refs >> and one of ref fails to pass the local check. >> >> IOW, generate_push_cert() MUST be called before this loop "clears >> the status for each ref" by assigning to ref->status. > > The next block ("Finally, tell the other end!") is what we send > commands to "receive-pack", right after some of the status are reset > to REF_STATUS_OK or REF_STATUS_EXPECTING_REPORT by this chunk of code. > So moving the generate_push_cert() part right before the "Finally, > tell the other end!" part LGTM. Sorry, I do not follow. The loop in question is the one before "Finally tell the other end". The loop ends like so: for (ref = remote_refs; ref; ref = ref->next) { ... if (args->dry_run || !status_report) ref->status = REF_STATUS_OK; else ref->status = REF_STATUS_EXPECTING_REPORT; } and the patch moves a call to generate_push_cert() that looks at remote_refs _after_ this loop, but generate_push_cert() function uses a loop over remote_refs that uses check_to_send_update(), which looks at ref->status's value to decide what to do. Its correct operation relies on ref->status NOT updated by the above loop. The loop prepares the status field so that we can then read and record the response against each ref updates from the other side. The ref->status field is set to EXPECTING_REPORT, later to be updated to REF_STATUS_OK or REF_STATUS_REMOTE_REJECT. We can clobber the original value of ref->status at this point only because the loop depends on the fact that no check_to_send_update() call will happen after the loop (because the ref->status field the helper's operation depends on is already reset for the next phase of operation). The patch that moves generate_push_cert() call below the loop, whether it is before or after the "Finally tell the other end" loop, is therefore fundamentally broken, isn't it? I do not think it would introduce such breakage if we teach generate_push_cert() to pay attention to the atomicity, and that can be done without reordering the calls in send_pack() to break the control flow.
Junio C Hamano <gitster@pobox.com> writes: > > The next block ("Finally, tell the other end!") is what we send > > commands to "receive-pack", right after some of the status are reset > > to REF_STATUS_OK or REF_STATUS_EXPECTING_REPORT by this chunk of code. > > So moving the generate_push_cert() part right before the "Finally, > > tell the other end!" part LGTM. > > Sorry, I do not follow. The loop in question is the one before > "Finally tell the other end". The loop ends like so: > > for (ref = remote_refs; ref; ref = ref->next) { > ... > if (args->dry_run || !status_report) > ref->status = REF_STATUS_OK; > else > ref->status = REF_STATUS_EXPECTING_REPORT; > } > > and the patch moves a call to generate_push_cert() that looks at > remote_refs _after_ this loop, but generate_push_cert() function > uses a loop over remote_refs that uses check_to_send_update(), which > looks at ref->status's value to decide what to do. Its correct > operation relies on ref->status NOT updated by the above loop. > To make it clear, I refactor the Han Xin's patch, quote and add comments as follows (changes on whitespace are ignored): >> /* >> * NEEDSWORK: why does delete-refs have to be so specific to >> * send-pack machinery that set_ref_status_for_push() cannot >> * set this bit for us??? >> */ >> for (ref = remote_refs; ref; ref = ref->next) >> if (ref->deletion && !allow_deleting_refs) >> ref->status = REF_STATUS_REJECT_NODELETE; >> >> - if (!args->dry_run) >> - advertise_shallow_grafts_buf(&req_buf); >> - >> - if (!args->dry_run && push_cert_nonce) >> - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, >> - cap_buf.buf, push_cert_nonce); >> - >> /* >> * Clear the status for each ref and see if we need to send >> * the pack data. >> */ >> for (ref = remote_refs; ref; ref = ref->next) { >> switch (check_to_send_update(ref, args)) { >> case 0: /* no error */ >> break; >> case CHECK_REF_STATUS_REJECTED: >> /* >> * When we know the server would reject a ref update if >> * we were to send it and we're trying to send the refs >> * atomically, abort the whole operation. >> */ >> if (use_atomic) { >> strbuf_release(&req_buf); >> strbuf_release(&cap_buf); >> reject_atomic_push(remote_refs, args->send_mirror); >> error("atomic push failed for ref %s. status: %d\n", >> ref->name, ref->status); >> return args->porcelain ? 0 : -1; >> } >> /* else fallthrough */ >> default: >> continue; >> } >> if (!ref->deletion) >> need_pack_data = 1; >> >> if (args->dry_run || !status_report) >> ref->status = REF_STATUS_OK; >> else >> ref->status = REF_STATUS_EXPECTING_REPORT; >> } >> >> + if (!args->dry_run) >> + advertise_shallow_grafts_buf(&req_buf); >> + >> + >> /* >> * Finally, tell the other end! >> */ >> + if (!args->dry_run && push_cert_nonce) >> + cmds_sent = generate_push_cert(&req_buf, remote_refs, args, >> + cap_buf.buf, push_cert_nonce); Moving `generate_push_cert()` here, will: 1. Increase the perforcemance a little bit for failed atomic push. 2. Make it clear that the commands will be sent only once. For GPG-signed push, commands will be sent via `generate_push_cert()`, and for non-GPG-signed push, commands will be sent using the following code. 3. For GPG-signed push, won't run the following loop. >> + else if (!args->dry_run) >> for (ref = remote_refs; ref; ref = ref->next) { >> char *old_hex, *new_hex; >> >> - if (args->dry_run || push_cert_nonce) >> - continue; >> - >> if (check_to_send_update(ref, args) < 0) >> continue; In the original "Finally, tell the other end" block, the function `check_to_send_update()` is also called for non-PGP-signed push. The 'ref->status' changed by the "Clear the status" block won't make any difference for the return value of the function `check_to_send_update()`. Refs even with status REF_STATUS_OK and REF_STATUS_EXPECTING_REPORT will be sent to the server side. >> >> old_hex = oid_to_hex(&ref->old_oid); >> new_hex = oid_to_hex(&ref->new_oid); >> if (!cmds_sent) { >> packet_buf_write(&req_buf, >> "%s %s %s%c%s", >> old_hex, new_hex, ref->name, 0, >> cap_buf.buf); >> cmds_sent = 1; >> } else { >> packet_buf_write(&req_buf, "%s %s %s", >> old_hex, new_hex, ref->name); >> } >> } > The loop prepares the status field so that we can then read and > record the response against each ref updates from the other side. > > The ref->status field is set to EXPECTING_REPORT, later to be > updated to REF_STATUS_OK or REF_STATUS_REMOTE_REJECT. We can > clobber the original value of ref->status at this point only because > the loop depends on the fact that no check_to_send_update() call > will happen after the loop (because the ref->status field the > helper's operation depends on is already reset for the next phase of > operation). The patch that moves generate_push_cert() call below > the loop, whether it is before or after the "Finally tell the other > end" loop, is therefore fundamentally broken, isn't it? > > I do not think it would introduce such breakage if we teach > generate_push_cert() to pay attention to the atomicity, and that can > be done without reordering the calls in send_pack() to break the > control flow.
在 2020/9/16 下午12:37, Junio C Hamano 写道: > Jiang Xin <worldhello.net@gmail.com> writes: > >>>> - >>>> - if (!args->dry_run && push_cert_nonce) >>>> - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, >>>> - cap_buf.buf, push_cert_nonce); >>>> - >>>> /* >>>> * Clear the status for each ref and see if we need to send >>>> * the pack data. >>> This "Clear the status for each ref" worries me. >>> >>> The generate_push_cert() function RELIES on ref->status and filters >>> out the ref that failed to pass the local check from the generated >>> push certificate. If you let the loop (post context of this hunk) >>> run, ref->status will be updated by it, so the net effect of this >>> patch is that it breaks "non-atomic" case that pushes multiple refs >>> and one of ref fails to pass the local check. >>> >>> IOW, generate_push_cert() MUST be called before this loop "clears >>> the status for each ref" by assigning to ref->status. >> The next block ("Finally, tell the other end!") is what we send >> commands to "receive-pack", right after some of the status are reset >> to REF_STATUS_OK or REF_STATUS_EXPECTING_REPORT by this chunk of code. >> So moving the generate_push_cert() part right before the "Finally, >> tell the other end!" part LGTM. > Sorry, I do not follow. The loop in question is the one before > "Finally tell the other end". The loop ends like so: > > for (ref = remote_refs; ref; ref = ref->next) { > ... > if (args->dry_run || !status_report) > ref->status = REF_STATUS_OK; > else > ref->status = REF_STATUS_EXPECTING_REPORT; > } > > and the patch moves a call to generate_push_cert() that looks at > remote_refs _after_ this loop, but generate_push_cert() function > uses a loop over remote_refs that uses check_to_send_update(), which > looks at ref->status's value to decide what to do. Its correct > operation relies on ref->status NOT updated by the above loop. > > The loop prepares the status field so that we can then read and > record the response against each ref updates from the other side. > > The ref->status field is set to EXPECTING_REPORT, later to be > updated to REF_STATUS_OK or REF_STATUS_REMOTE_REJECT. We can > clobber the original value of ref->status at this point only because > the loop depends on the fact that no check_to_send_update() call > will happen after the loop (because the ref->status field the > helper's operation depends on is already reset for the next phase of > operation). The patch that moves generate_push_cert() call below > the loop, whether it is before or after the "Finally tell the other > end" loop, is therefore fundamentally broken, isn't it? > > I do not think it would introduce such breakage if we teach > generate_push_cert() to pay attention to the atomicity, and that can > be done without reordering the calls in send_pack() to break the > control flow. Thank you for your reply. These loops here really confuse me at first. But I found that the main effect of "Clear the status for each ref and see if we need to send the pack data" is to help us do a pre-check on the client side whether the push should be rejected. When the reference should be pushed, whether the status was changed to REF_STATUS_OK or REF_STATUS_EXPECTING_REPORT, it does not seem to affect the result of function generate_push_cert(). check_to_send_update() in generate_push_cert() only filters out references that needn't to be pushed. Just like brian m. carlson said, "that would be a nice change; after all, the user's key may involve a smartcard or a passphrase and avoiding needless hassle for the user would be desirable". It increase the perforcemance a little bit for failed atomic push and make it clear that client side requirements and the other side requirements. If there is something wrong with my understanding, I am very grateful \that you can help me point out the problems.
Jiang Xin <worldhello.net@gmail.com> writes: > In the original "Finally, tell the other end" block, the function > `check_to_send_update()` is also called for non-PGP-signed push. > The 'ref->status' changed by the "Clear the status" block won't > make any difference for the return value of the function > `check_to_send_update()`. Refs even with status REF_STATUS_OK and > REF_STATUS_EXPECTING_REPORT will be sent to the server side. Ah, yes, I re-read the code in check_to_send_update() and you're right that it does the right thing. I however strongly suspect it just happens to do the right thing by accident and not by design. I'd prefer to see a bit more tightening done to the function to clarify the handling of these two values that are omitted from the case arms in the switch statement, perhaps like this, as a preliminary clean-up. As a further clean-up, we probably should stop relying on the 'default' label. There are other REF_STATUS values that are not handled explicitly, among which REF_STATUS_ATOMIC_PUSH_FAILED looks like the most troublesome one. The function will return 0 (success) for ATOMIC_PUSH_FAILED, but the current ordering of the codeflow makes sure check_to_send_update() is *not* called after ref->status is turned into that value and that would be the only thing that may be ensuring the correctness. There may be other ones we are not handling quite right. send-pack.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/send-pack.c b/send-pack.c index 632f1580ca..347fb15633 100644 --- a/send-pack.c +++ b/send-pack.c @@ -244,7 +244,12 @@ static int check_to_send_update(const struct ref *ref, const struct send_pack_ar return CHECK_REF_STATUS_REJECTED; case REF_STATUS_UPTODATE: return CHECK_REF_UPTODATE; + default: + case REF_STATUS_EXPECTING_REPORT: + /* already passed checks on the local side */ + case REF_STATUS_OK: + /* of course this is OK */ return 0; } }
diff --git a/send-pack.c b/send-pack.c index d671ab5d05..58416a6f6d 100644 --- a/send-pack.c +++ b/send-pack.c @@ -447,13 +447,6 @@ int send_pack(struct send_pack_args *args, if (ref->deletion && !allow_deleting_refs) ref->status = REF_STATUS_REJECT_NODELETE; - if (!args->dry_run) - advertise_shallow_grafts_buf(&req_buf); - - if (!args->dry_run && push_cert_nonce) - cmds_sent = generate_push_cert(&req_buf, remote_refs, args, - cap_buf.buf, push_cert_nonce); - /* * Clear the status for each ref and see if we need to send * the pack data. @@ -489,6 +482,13 @@ int send_pack(struct send_pack_args *args, ref->status = REF_STATUS_EXPECTING_REPORT; } + if (!args->dry_run) + advertise_shallow_grafts_buf(&req_buf); + + if (!args->dry_run && push_cert_nonce) + cmds_sent = generate_push_cert(&req_buf, remote_refs, args, + cap_buf.buf, push_cert_nonce); + /* * Finally, tell the other end! */ diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh index d0fcdc900e..927750a408 100755 --- a/t/t5534-push-signed.sh +++ b/t/t5534-push-signed.sh @@ -273,7 +273,7 @@ test_expect_success GPGSM 'fail without key and heed user.signingkey x509' ' test_cmp expect dst/push-cert-status ' -test_expect_failure GPG 'check atomic push before running GPG' ' +test_expect_success GPG 'check atomic push before running GPG' ' prepare_dst && git -C dst config receive.certnonceseed sekrit && write_script gpg <<-EOF &&