Message ID | 20240226220539.3494-4-randall.becker@nexbridge.ca (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Change xwrite() to write_in_full() in builtins. | expand |
"Randall S. Becker" <the.n.e.key@gmail.com> writes: > From: "Randall S. Becker" <rsbecker@nexbridge.com> > > This change is required because some platforms do not support file writes of > arbitrary sizes (e.g, NonStop). xwrite ends up truncating the output to the > maximum single I/O size possible for the destination device. The result of > write_in_full() is also passed to the caller, which was previously ignored. This one smells more like a theoretical issue than realistic, in that these writes are done only with .hexsz (either 40 or 64) bytes oid string, or a single byte "\n", for either of which it is hard to imagine that it is even remotely close to platform "maximum single I/O size". But we'd need to look for the error return anyway, so switching to write_in_full() while we are doing so is also good. > Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com> > --- > builtin/repack.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/builtin/repack.c b/builtin/repack.c > index ede36328a3..932d24c60b 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -307,6 +307,7 @@ static int write_oid(const struct object_id *oid, > struct packed_git *pack UNUSED, > uint32_t pos UNUSED, void *data) > { > + int err; > struct child_process *cmd = data; > > if (cmd->in == -1) { > @@ -314,8 +315,12 @@ static int write_oid(const struct object_id *oid, > die(_("could not start pack-objects to repack promisor objects")); > } > > - xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); > - xwrite(cmd->in, "\n", 1); > + err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); > + if (err <= 0) > + return err; > + err = write_in_full(cmd->in, "\n", 1); > + if (err <= 0) > + return err; > return 0; > }
On Mon, Feb 26, 2024 at 05:05:37PM -0500, Randall S. Becker wrote: > From: "Randall S. Becker" <rsbecker@nexbridge.com> > > This change is required because some platforms do not support file writes of > arbitrary sizes (e.g, NonStop). xwrite ends up truncating the output to the > maximum single I/O size possible for the destination device. The result of > write_in_full() is also passed to the caller, which was previously ignored. These are going to be tiny compared to single-write() I/O limits, I'd think, but in general we should be on guard for the OS returning short reads (this is a pipe and so for most systems PIPE_BUF would guarantee atomicity, I think, but IMHO it is simpler to just make things obviously-correct by looping with write_in_full). So I'd be surprised if this spot was the cause of a visible bug, but I think it's worth changing regardless. The error detection is a separate question, though. I think it is good to check the result of the write here, as an error here means that the child pack-objects misses some objects we wanted it to pack, which could lead to a corrupt repository. But I don't think what you have here is quite enough: > @@ -314,8 +315,12 @@ static int write_oid(const struct object_id *oid, > die(_("could not start pack-objects to repack promisor objects")); > } > > - xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); > - xwrite(cmd->in, "\n", 1); > + err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); > + if (err <= 0) > + return err; > + err = write_in_full(cmd->in, "\n", 1); > + if (err <= 0) > + return err; > return 0; OK, so we detect the error and return it to the caller. Who is the caller? The only use of this function is in repack_promisor_objects(), which calls: for_each_packed_object(write_oid, &cmd, FOR_EACH_OBJECT_PROMISOR_ONLY); So when we return the error, now for_each_packed_object() will stop traversing, and propagate that error up to the caller. But as we can see above, the caller ignores it! So I think you'd either want to die directly (perhaps using write_or_die). Or you'd need to additionally check the return from for_each_packed_object(). That would also catch cases where that function failed to open a pack (I'm not sure how important that is to this code). But as it is, your patch just causes a write error to truncate the list of oids send to the child process (though that is probably not materially different from the current behavior, as the subsequent calls would presumably fail, too). -Peff
On Tue, Feb 27, 2024 at 03:20:27AM -0500, Jeff King wrote: > OK, so we detect the error and return it to the caller. Who is the > caller? The only use of this function is in repack_promisor_objects(), > which calls: > > for_each_packed_object(write_oid, &cmd, > FOR_EACH_OBJECT_PROMISOR_ONLY); > > So when we return the error, now for_each_packed_object() will stop > traversing, and propagate that error up to the caller. But as we can see > above, the caller ignores it! Oh, one other thing I meant to mention: as the test failure you saw was related to repacking, this seemed like a likely culprit. But the code is only triggered when repacking promisor objects in a partial clone, and it didn't look like the test you posted covered that (it was just about cruft packs). So I would not expect this code to be run at all in the failing test you saw. -Peff
diff --git a/builtin/repack.c b/builtin/repack.c index ede36328a3..932d24c60b 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -307,6 +307,7 @@ static int write_oid(const struct object_id *oid, struct packed_git *pack UNUSED, uint32_t pos UNUSED, void *data) { + int err; struct child_process *cmd = data; if (cmd->in == -1) { @@ -314,8 +315,12 @@ static int write_oid(const struct object_id *oid, die(_("could not start pack-objects to repack promisor objects")); } - xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); - xwrite(cmd->in, "\n", 1); + err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz); + if (err <= 0) + return err; + err = write_in_full(cmd->in, "\n", 1); + if (err <= 0) + return err; return 0; }