Message ID | a327f484f7f7466597930e87686e7156beabdc45.1654871916.git.chiyutianyi@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | unpack-objects: support streaming blobs to disk | expand |
Am 10.06.22 um 16:46 schrieb Han Xin: > From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > Change write_loose_object() to do an fsync() and close() before the > oideq() sanity check at the end. This change re-joins code that was > split up by the die() sanity check added in 748af44c63e (sha1_file: be > paranoid when creating loose objects, 2010-02-21). > > I don't think that this change matters in itself, if we called die() > it was possible that our data wouldn't fully make it to disk, but in > any case we were writing data that we'd consider corrupted. It's > possible that a subsequent "git fsck" will be less confused now. This is done before renaming the file, so git fsck is going to see (at most) a tmp_obj_?????? file, which it ignores in either case, right? > The real reason to make this change is that in a subsequent commit > we'll split this code in write_loose_object() into a utility function, > all its callers will want the preceding sanity checks, but not the > "oideq" check. By moving the close_loose_object() earlier it'll be > easier to reason about the introduction of the utility function. This sounds like a patch would move the close_loose_object() call to some other place, but that's not the case. The sequence below (starting from the close_loose_object() call) is still present after applying the whole series, so it seems this patch is not necessary. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > object-file.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/object-file.c b/object-file.c > index 79eb8339b6..e4a83012ba 100644 > --- a/object-file.c > +++ b/object-file.c > @@ -2012,12 +2012,12 @@ static int write_loose_object(const struct object_id *oid, char *hdr, > die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid), > ret); > the_hash_algo->final_oid_fn(¶no_oid, &c); > + close_loose_object(fd, tmp_file.buf); > + > if (!oideq(oid, ¶no_oid)) > die(_("confused by unstable object source data for %s"), > oid_to_hex(oid)); > > - close_loose_object(fd, tmp_file.buf); > - > if (mtime) { > struct utimbuf utb; > utb.actime = mtime;
René Scharfe <l.s.r@web.de> writes: > Am 10.06.22 um 16:46 schrieb Han Xin: >> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> >> Change write_loose_object() to do an fsync() and close() before the >> oideq() sanity check at the end. This change re-joins code that was >> split up by the die() sanity check added in 748af44c63e (sha1_file: be >> paranoid when creating loose objects, 2010-02-21). >> >> I don't think that this change matters in itself, if we called die() >> it was possible that our data wouldn't fully make it to disk, but in >> any case we were writing data that we'd consider corrupted. It's >> possible that a subsequent "git fsck" will be less confused now. > > This is done before renaming the file, so git fsck is going to see (at > most) a tmp_obj_?????? file, which it ignores in either case, right? Yes, I thought I pointed that out in my review on the previous round, but I missed that it was still here in this round X-<. Thanks for noticing.
On Sat, Jun 11, 2022 at 5:33 AM Junio C Hamano <gitster@pobox.com> wrote: > > René Scharfe <l.s.r@web.de> writes: > > > Am 10.06.22 um 16:46 schrieb Han Xin: > >> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > >> > >> Change write_loose_object() to do an fsync() and close() before the > >> oideq() sanity check at the end. This change re-joins code that was > >> split up by the die() sanity check added in 748af44c63e (sha1_file: be > >> paranoid when creating loose objects, 2010-02-21). > >> > >> I don't think that this change matters in itself, if we called die() > >> it was possible that our data wouldn't fully make it to disk, but in > >> any case we were writing data that we'd consider corrupted. It's > >> possible that a subsequent "git fsck" will be less confused now. > > > > This is done before renaming the file, so git fsck is going to see (at > > most) a tmp_obj_?????? file, which it ignores in either case, right? > > Yes, I thought I pointed that out in my review on the previous > round, but I missed that it was still here in this round X-<. > > Thanks for noticing. > Yes, agree with both of you, I'll be removing this patch in the next series. This patch was first introduced in v10[1], close_loose_object() was moved to end_loose_object_common(), but it was put back in v12[2]. It is indeed no longer necessary now. 1. https://lore.kernel.org/git/patch-v10-3.6-0e33d2a6e35-20220204T135538Z-avarab@gmail.com/ 2. https://lore.kernel.org/git/patch-v12-2.8-54060eb8c6b-20220329T135446Z-avarab@gmail.com/ Thank you both. -Han Xin -Han Xin
diff --git a/object-file.c b/object-file.c index 79eb8339b6..e4a83012ba 100644 --- a/object-file.c +++ b/object-file.c @@ -2012,12 +2012,12 @@ static int write_loose_object(const struct object_id *oid, char *hdr, die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid), ret); the_hash_algo->final_oid_fn(¶no_oid, &c); + close_loose_object(fd, tmp_file.buf); + if (!oideq(oid, ¶no_oid)) die(_("confused by unstable object source data for %s"), oid_to_hex(oid)); - close_loose_object(fd, tmp_file.buf); - if (mtime) { struct utimbuf utb; utb.actime = mtime;