Message ID | 6722ade6-971e-7ecc-e8f0-7f595ca0b0ff@web.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | upload-pack: use buffered I/O to talk to rev-list | expand |
One suggestion here: On Sun, Aug 2, 2020 at 7:41 AM René Scharfe <l.s.r@web.de> wrote: > Like f0bca72dc77 (send-pack: use buffered I/O to talk to pack-objects, > 2016-06-08), significantly reduce the number of system calls and > simplify the code for sending object IDs to rev-list by using stdio's > buffering and handling errors after the loops. > > Signed-off-by: René Scharfe <l.s.r@web.de> > --- > upload-pack.c | 21 +++++++-------------- > 1 file changed, 7 insertions(+), 14 deletions(-) > > diff --git a/upload-pack.c b/upload-pack.c > index 86737410709..9f616c2c6a6 100644 > --- a/upload-pack.c > +++ b/upload-pack.c [snip] > @@ -640,12 +636,11 @@ static int do_reachable_revlist(struct child_process *cmd, > } > if (reachable && o->type == OBJ_COMMIT) > o->flags |= TMP_MARK; > - memcpy(namebuf, oid_to_hex(&o->oid), hexsz); > - if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0) > - goto error; > + fprintf(cmd_in, "%s\n", oid_to_hex(&o->oid)); The fprintf() call here *can* return an error, e.g., if the connection has died. If it does, it should set things up so that a later ferror(cmd_in) returns true. > } > - close(cmd->in); > cmd->in = -1; > + if (fclose(cmd_in)) > + goto error; The fclose() call doesn't necessarily check ferror(). (The FreeBSD stdio in particular definitely does not.) It might be better to use: failure = ferror(cmd_in); failure |= fclose(cmd_in); if (failure) ... here, or similar. (The temporary variable is not needed, but someone might assume `if (ferror(fp) | fclose(fp))` is a typo for `if (ferror(fp) || fclose(fp))`.) (Note: my sample isn't properly indented as gmail does not let me insert tabs easily) Chris
Am 02.08.20 um 18:03 schrieb Chris Torek: > One suggestion here: > > On Sun, Aug 2, 2020 at 7:41 AM René Scharfe <l.s.r@web.de> wrote: >> Like f0bca72dc77 (send-pack: use buffered I/O to talk to pack-objects, >> 2016-06-08), significantly reduce the number of system calls and >> simplify the code for sending object IDs to rev-list by using stdio's >> buffering and handling errors after the loops. >> >> Signed-off-by: René Scharfe <l.s.r@web.de> >> --- >> upload-pack.c | 21 +++++++-------------- >> 1 file changed, 7 insertions(+), 14 deletions(-) >> >> diff --git a/upload-pack.c b/upload-pack.c >> index 86737410709..9f616c2c6a6 100644 >> --- a/upload-pack.c >> +++ b/upload-pack.c > > [snip] > >> @@ -640,12 +636,11 @@ static int do_reachable_revlist(struct child_process *cmd, >> } >> if (reachable && o->type == OBJ_COMMIT) >> o->flags |= TMP_MARK; >> - memcpy(namebuf, oid_to_hex(&o->oid), hexsz); >> - if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0) >> - goto error; >> + fprintf(cmd_in, "%s\n", oid_to_hex(&o->oid)); > > The fprintf() call here *can* return an error, e.g., if the > connection has died. If it does, it should set things up so that > a later ferror(cmd_in) returns true. > >> } >> - close(cmd->in); >> cmd->in = -1; >> + if (fclose(cmd_in)) >> + goto error; > > The fclose() call doesn't necessarily check ferror(). (The > FreeBSD stdio in particular definitely does not.) It might > be better to use: > > failure = ferror(cmd_in); > failure |= fclose(cmd_in); > if (failure) ... > > here, or similar. (The temporary variable is not needed, > but someone might assume `if (ferror(fp) | fclose(fp))` is > a typo for `if (ferror(fp) || fclose(fp))`.) Thanks, didn't know that. That's awful. So the sentence "The fclose() and fdclose() functions may also fail and set errno for any of the errors specified for fflush(3)." from the FreeBSD manpage for fclose(3) actually means that while it will flush, it is free to ignore any flush errors? Or do you mean that fflush() can succeed on a stream that has its error indicator set? In any case, we'd then better add a function that flushes the buffer, closes the stream and reports errors in its return code and errno -- i.e. a sane fclose(). René
On Mon, Aug 3, 2020 at 7:00 AM René Scharfe <l.s.r@web.de> wrote: > > Am 02.08.20 um 18:03 schrieb Chris Torek: > > The fclose() call doesn't necessarily check ferror(). (The > > FreeBSD stdio in particular definitely does not.) > > Thanks, didn't know that. That's awful. So the sentence "The fclose() > and fdclose() functions may also fail and set errno for any of the > errors specified for fflush(3)." from the FreeBSD manpage for fclose(3) > actually means that while it will flush, it is free to ignore any > flush errors? > > Or do you mean that fflush() can succeed on a stream that has its error > indicator set? The latter. You have to get particularly (un?)lucky here: say the buffer size is n, and the *last* write operation (fprintf, whatever) filled the buffer mod n and called fflush internally and that failed. Then at the time you call fclose() the buffer is empty. There is nothing to flush, so we just get the result of the system's close() call. > In any case, we'd then better add a function that flushes the buffer, > closes the stream and reports errors in its return code and errno -- > i.e. a sane fclose(). Unfortunately the global-variable nature of errno means it may be clobbered by the time you inspect it here, again with the same sort of issue above. It might be nice if C's error conventions were more like Go's, and stdio retained an earlier error, but this stuff all dates back to 16-bit "int" and 512-byte buffers and no room for nice stuff. :-) Chris
Am 02.08.20 um 18:03 schrieb Chris Torek: > One suggestion here: > > On Sun, Aug 2, 2020 at 7:41 AM René Scharfe <l.s.r@web.de> wrote: >> Like f0bca72dc77 (send-pack: use buffered I/O to talk to pack-objects, >> 2016-06-08), significantly reduce the number of system calls and >> simplify the code for sending object IDs to rev-list by using stdio's >> buffering and handling errors after the loops. >> >> Signed-off-by: René Scharfe <l.s.r@web.de> >> --- >> upload-pack.c | 21 +++++++-------------- >> 1 file changed, 7 insertions(+), 14 deletions(-) >> >> diff --git a/upload-pack.c b/upload-pack.c >> index 86737410709..9f616c2c6a6 100644 >> --- a/upload-pack.c >> +++ b/upload-pack.c > > [snip] > >> @@ -640,12 +636,11 @@ static int do_reachable_revlist(struct child_process *cmd, >> } >> if (reachable && o->type == OBJ_COMMIT) >> o->flags |= TMP_MARK; >> - memcpy(namebuf, oid_to_hex(&o->oid), hexsz); >> - if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0) >> - goto error; >> + fprintf(cmd_in, "%s\n", oid_to_hex(&o->oid)); > > The fprintf() call here *can* return an error, e.g., if the > connection has died. If it does, it should set things up so that > a later ferror(cmd_in) returns true. True. We need an explicit test after each fprintf anyway because SIGPIPE may be ignored, and then writing fails with EPIPE. On Windows, this is doubly important because we do not have SIGPIPE at all (and always see EPIPE), but we see EPIPE only on the first failed write; subsequent writes produce EINVAL. -- Hannes
diff --git a/upload-pack.c b/upload-pack.c index 86737410709..9f616c2c6a6 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -595,10 +595,9 @@ static int do_reachable_revlist(struct child_process *cmd, static const char *argv[] = { "rev-list", "--stdin", NULL, }; + FILE *cmd_in; struct object *o; - char namebuf[GIT_MAX_HEXSZ + 2]; /* ^ + hash + LF */ int i; - const unsigned hexsz = the_hash_algo->hexsz; cmd->argv = argv; cmd->git_cmd = 1; @@ -616,8 +615,8 @@ static int do_reachable_revlist(struct child_process *cmd, if (start_command(cmd)) goto error; - namebuf[0] = '^'; - namebuf[hexsz + 1] = '\n'; + cmd_in = xfdopen(cmd->in, "w"); + for (i = get_max_object_index(); 0 < i; ) { o = get_indexed_object(--i); if (!o) @@ -626,11 +625,8 @@ static int do_reachable_revlist(struct child_process *cmd, o->flags &= ~TMP_MARK; if (!is_our_ref(o, allow_uor)) continue; - memcpy(namebuf + 1, oid_to_hex(&o->oid), hexsz); - if (write_in_full(cmd->in, namebuf, hexsz + 2) < 0) - goto error; + fprintf(cmd_in, "^%s\n", oid_to_hex(&o->oid)); } - namebuf[hexsz] = '\n'; for (i = 0; i < src->nr; i++) { o = src->objects[i].item; if (is_our_ref(o, allow_uor)) { @@ -640,12 +636,11 @@ static int do_reachable_revlist(struct child_process *cmd, } if (reachable && o->type == OBJ_COMMIT) o->flags |= TMP_MARK; - memcpy(namebuf, oid_to_hex(&o->oid), hexsz); - if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0) - goto error; + fprintf(cmd_in, "%s\n", oid_to_hex(&o->oid)); } - close(cmd->in); cmd->in = -1; + if (fclose(cmd_in)) + goto error; sigchain_pop(SIGPIPE); return 0; @@ -653,8 +648,6 @@ static int do_reachable_revlist(struct child_process *cmd, error: sigchain_pop(SIGPIPE); - if (cmd->in >= 0) - close(cmd->in); if (cmd->out >= 0) close(cmd->out); return -1;
Like f0bca72dc77 (send-pack: use buffered I/O to talk to pack-objects, 2016-06-08), significantly reduce the number of system calls and simplify the code for sending object IDs to rev-list by using stdio's buffering and handling errors after the loops. Signed-off-by: René Scharfe <l.s.r@web.de> --- upload-pack.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-) -- 2.28.0