Message ID | 20210928001726.2592734-1-mh@glandium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Don't ignore transport_disconnect error codes in fetch and clone | expand |
On Tue, Sep 28 2021, Mike Hommey wrote: > When a remote-helper fails in a way that is not directly visible in the > remote-helper protocol, the helper failure is ignored by git during > fetch or clone. > > For example, a helper cannot directly report an error during an `import` > command (short of sending `feature done` to the fast-import file > descriptor and not sending a `done` later on). > > Or if the helper crashes at the wrong moment, git doesn't notice and > thinks everything went well. > > Signed-off-by: Mike Hommey <mh@glandium.org> > --- > builtin/clone.c | 5 +++-- > builtin/fetch.c | 6 +++--- > 2 files changed, 6 insertions(+), 5 deletions(-) > > What I'm not sure about is whether a message should be explicitly > printed by git itself in those cases. > > diff --git a/builtin/clone.c b/builtin/clone.c > index 66fe66679c..f26fa027c5 100644 > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) > submodule_progress = transport->progress; > > transport_unlock_pack(transport); > - transport_disconnect(transport); > + err = transport_disconnect(transport); > > if (option_dissociate) { > close_object_store(the_repository->objects); > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) > } > > junk_mode = JUNK_LEAVE_REPO; > - err = checkout(submodule_progress); > + if (!err) > + err = checkout(submodule_progress); > > free(remote_name); > strbuf_release(&reflog_msg); This seems buggy in some cases, e.g. just because we couldn't close() some final socket we should just not run the checkout at all? Shouldn't we note the disconnect status, but still see if we can do the checkout etc? > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 25740c13df..66bccf6f50 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, > { > struct refspec rs = REFSPEC_INIT_FETCH; > int i; > - int exit_code; > + int exit_code, disconnect_code; > int maybe_prune_tags; > int remote_via_config = remote_is_configured(remote, 0); > > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, > exit_code = do_fetch(gtransport, &rs); > sigchain_pop(SIGPIPE); > refspec_clear(&rs); > - transport_disconnect(gtransport); > + disconnect_code = transport_disconnect(gtransport); > gtransport = NULL; > - return exit_code; > + return exit_code || disconnect_code; > } > > int cmd_fetch(int argc, const char **argv, const char *prefix) This seems like it really needs fixes in other areas, i.e. disconnect_git() returns 0 unconditionally, but should check at least finish_connect(), no? Also once that's done you'll have a logic error here where you're conflating exit and error codes, we should not return -1 from main() (we do in a few cases, but it's nasty)>
Hi, Sorry for the delay, I managed to miss this reply. On Tue, Sep 28, 2021 at 04:56:37AM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Sep 28 2021, Mike Hommey wrote: > > > When a remote-helper fails in a way that is not directly visible in the > > remote-helper protocol, the helper failure is ignored by git during > > fetch or clone. > > > > For example, a helper cannot directly report an error during an `import` > > command (short of sending `feature done` to the fast-import file > > descriptor and not sending a `done` later on). > > > > Or if the helper crashes at the wrong moment, git doesn't notice and > > thinks everything went well. > > > > Signed-off-by: Mike Hommey <mh@glandium.org> > > --- > > builtin/clone.c | 5 +++-- > > builtin/fetch.c | 6 +++--- > > 2 files changed, 6 insertions(+), 5 deletions(-) > > > > What I'm not sure about is whether a message should be explicitly > > printed by git itself in those cases. > > > > diff --git a/builtin/clone.c b/builtin/clone.c > > index 66fe66679c..f26fa027c5 100644 > > --- a/builtin/clone.c > > +++ b/builtin/clone.c > > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) > > submodule_progress = transport->progress; > > > > transport_unlock_pack(transport); > > - transport_disconnect(transport); > > + err = transport_disconnect(transport); > > > > if (option_dissociate) { > > close_object_store(the_repository->objects); > > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) > > } > > > > junk_mode = JUNK_LEAVE_REPO; > > - err = checkout(submodule_progress); > > + if (!err) > > + err = checkout(submodule_progress); > > > > free(remote_name); > > strbuf_release(&reflog_msg); > > This seems buggy in some cases, e.g. just because we couldn't close() > some final socket we should just not run the checkout at all? Shouldn't > we note the disconnect status, but still see if we can do the checkout > etc? I guess it could be seen both ways. In my particular use case, I do want automated testing to be able to catch Leak Sanitizer failures from a remote-helper. > > diff --git a/builtin/fetch.c b/builtin/fetch.c > > index 25740c13df..66bccf6f50 100644 > > --- a/builtin/fetch.c > > +++ b/builtin/fetch.c > > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, > > { > > struct refspec rs = REFSPEC_INIT_FETCH; > > int i; > > - int exit_code; > > + int exit_code, disconnect_code; > > int maybe_prune_tags; > > int remote_via_config = remote_is_configured(remote, 0); > > > > > > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, > > exit_code = do_fetch(gtransport, &rs); > > sigchain_pop(SIGPIPE); > > refspec_clear(&rs); > > - transport_disconnect(gtransport); > > + disconnect_code = transport_disconnect(gtransport); > > gtransport = NULL; > > - return exit_code; > > + return exit_code || disconnect_code; > > } > > > > int cmd_fetch(int argc, const char **argv, const char *prefix) > > This seems like it really needs fixes in other areas, > i.e. disconnect_git() returns 0 unconditionally, but should check at > least finish_connect(), no? > > Also once that's done you'll have a logic error here where you're > conflating exit and error codes, we should not return -1 from main() (we > do in a few cases, but it's nasty)> transport_disconnect returns the remote-helper exit code for external helpers, though, but I guess we don't necessarily need to return that particular exit code. Mike
On Tue, Oct 05 2021, Mike Hommey wrote: > Hi, > > Sorry for the delay, I managed to miss this reply. > > On Tue, Sep 28, 2021 at 04:56:37AM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> On Tue, Sep 28 2021, Mike Hommey wrote: >> >> > When a remote-helper fails in a way that is not directly visible in the >> > remote-helper protocol, the helper failure is ignored by git during >> > fetch or clone. >> > >> > For example, a helper cannot directly report an error during an `import` >> > command (short of sending `feature done` to the fast-import file >> > descriptor and not sending a `done` later on). >> > >> > Or if the helper crashes at the wrong moment, git doesn't notice and >> > thinks everything went well. >> > >> > Signed-off-by: Mike Hommey <mh@glandium.org> >> > --- >> > builtin/clone.c | 5 +++-- >> > builtin/fetch.c | 6 +++--- >> > 2 files changed, 6 insertions(+), 5 deletions(-) >> > >> > What I'm not sure about is whether a message should be explicitly >> > printed by git itself in those cases. >> > >> > diff --git a/builtin/clone.c b/builtin/clone.c >> > index 66fe66679c..f26fa027c5 100644 >> > --- a/builtin/clone.c >> > +++ b/builtin/clone.c >> > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) >> > submodule_progress = transport->progress; >> > >> > transport_unlock_pack(transport); >> > - transport_disconnect(transport); >> > + err = transport_disconnect(transport); >> > >> > if (option_dissociate) { >> > close_object_store(the_repository->objects); >> > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) >> > } >> > >> > junk_mode = JUNK_LEAVE_REPO; >> > - err = checkout(submodule_progress); >> > + if (!err) >> > + err = checkout(submodule_progress); >> > >> > free(remote_name); >> > strbuf_release(&reflog_msg); >> >> This seems buggy in some cases, e.g. just because we couldn't close() >> some final socket we should just not run the checkout at all? Shouldn't >> we note the disconnect status, but still see if we can do the checkout >> etc? > > I guess it could be seen both ways. In my particular use case, I do want > automated testing to be able to catch Leak Sanitizer failures from a > remote-helper. I'd like that too, but I'm pointing out something unrelated (or at least I think I am), which is that this is changing the equivalent of this code: int fd = open(...); close(fd); do_stuff(); To be: int err; int fd = open(...); /* check err here too, but whatever, pseuodocode() */ if (!close(fd)) do_stuff(); But just with sockets, i.e. just because transport_disconnect() had *some* error are we really confident that checkout() should not be run? *Maybe*, but at the very least shouldn't we distinguish some of those error states? Isn't any serious protocol error going to be caught earlier transport_disconnect() is just going to give us the equivalent of not being able to cleanly close the socket? Except as noted in my second comment it doesn't, since we ignore disconnect_git(). At this point I can't recall what we do check, and I intentionally haven't re-looked, but that's also a suggestion for your commit message :) I.e. what error states are going to be different now, when don't we proceed because of a disconnect_git() failure? >> > diff --git a/builtin/fetch.c b/builtin/fetch.c >> > index 25740c13df..66bccf6f50 100644 >> > --- a/builtin/fetch.c >> > +++ b/builtin/fetch.c >> > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, >> > { >> > struct refspec rs = REFSPEC_INIT_FETCH; >> > int i; >> > - int exit_code; >> > + int exit_code, disconnect_code; >> > int maybe_prune_tags; >> > int remote_via_config = remote_is_configured(remote, 0); >> > >> >> >> > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, >> > exit_code = do_fetch(gtransport, &rs); >> > sigchain_pop(SIGPIPE); >> > refspec_clear(&rs); >> > - transport_disconnect(gtransport); >> > + disconnect_code = transport_disconnect(gtransport); >> > gtransport = NULL; >> > - return exit_code; >> > + return exit_code || disconnect_code; >> > } >> > >> > int cmd_fetch(int argc, const char **argv, const char *prefix) >> >> This seems like it really needs fixes in other areas, >> i.e. disconnect_git() returns 0 unconditionally, but should check at >> least finish_connect(), no? >> >> Also once that's done you'll have a logic error here where you're >> conflating exit and error codes, we should not return -1 from main() (we >> do in a few cases, but it's nasty)> > > transport_disconnect returns the remote-helper exit code for external > helpers, though, but I guess we don't necessarily need to return that > particular exit code. Right, some of these functions are quite naughty about that, and as noted we get it wrong in a lot of existing cases. But it is unportable (non-standard C, but POSIX plasters over it), i.e. a return of -1 from main() is interpreted as a wait() status in POSIX, and results in a return of -1 & 0xff == 255. But for new code let's try to get it right, which in this case is going to be transport_disconnect() being being a syscall error code, and the "exit code" being a return code, which should not be mixed, no?
diff --git a/builtin/clone.c b/builtin/clone.c index 66fe66679c..f26fa027c5 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) submodule_progress = transport->progress; transport_unlock_pack(transport); - transport_disconnect(transport); + err = transport_disconnect(transport); if (option_dissociate) { close_object_store(the_repository->objects); @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } junk_mode = JUNK_LEAVE_REPO; - err = checkout(submodule_progress); + if (!err) + err = checkout(submodule_progress); free(remote_name); strbuf_release(&reflog_msg); diff --git a/builtin/fetch.c b/builtin/fetch.c index 25740c13df..66bccf6f50 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, { struct refspec rs = REFSPEC_INIT_FETCH; int i; - int exit_code; + int exit_code, disconnect_code; int maybe_prune_tags; int remote_via_config = remote_is_configured(remote, 0); @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, exit_code = do_fetch(gtransport, &rs); sigchain_pop(SIGPIPE); refspec_clear(&rs); - transport_disconnect(gtransport); + disconnect_code = transport_disconnect(gtransport); gtransport = NULL; - return exit_code; + return exit_code || disconnect_code; } int cmd_fetch(int argc, const char **argv, const char *prefix)
When a remote-helper fails in a way that is not directly visible in the remote-helper protocol, the helper failure is ignored by git during fetch or clone. For example, a helper cannot directly report an error during an `import` command (short of sending `feature done` to the fast-import file descriptor and not sending a `done` later on). Or if the helper crashes at the wrong moment, git doesn't notice and thinks everything went well. Signed-off-by: Mike Hommey <mh@glandium.org> --- builtin/clone.c | 5 +++-- builtin/fetch.c | 6 +++--- 2 files changed, 6 insertions(+), 5 deletions(-) What I'm not sure about is whether a message should be explicitly printed by git itself in those cases.