diff mbox series

[v14,2/7] object-file.c: do fsync() and close() before post-write die()

Message ID a327f484f7f7466597930e87686e7156beabdc45.1654871916.git.chiyutianyi@gmail.com (mailing list archive)
State Superseded
Headers show
Series unpack-objects: support streaming blobs to disk | expand

Commit Message

Han Xin June 10, 2022, 2:46 p.m. UTC
From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Change write_loose_object() to do an fsync() and close() before the
oideq() sanity check at the end. This change re-joins code that was
split up by the die() sanity check added in 748af44c63e (sha1_file: be
paranoid when creating loose objects, 2010-02-21).

I don't think that this change matters in itself, if we called die()
it was possible that our data wouldn't fully make it to disk, but in
any case we were writing data that we'd consider corrupted. It's
possible that a subsequent "git fsck" will be less confused now.

The real reason to make this change is that in a subsequent commit
we'll split this code in write_loose_object() into a utility function,
all its callers will want the preceding sanity checks, but not the
"oideq" check. By moving the close_loose_object() earlier it'll be
easier to reason about the introduction of the utility function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-file.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

René Scharfe June 10, 2022, 9:10 p.m. UTC | #1
Am 10.06.22 um 16:46 schrieb Han Xin:
> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>
> Change write_loose_object() to do an fsync() and close() before the
> oideq() sanity check at the end. This change re-joins code that was
> split up by the die() sanity check added in 748af44c63e (sha1_file: be
> paranoid when creating loose objects, 2010-02-21).
>
> I don't think that this change matters in itself, if we called die()
> it was possible that our data wouldn't fully make it to disk, but in
> any case we were writing data that we'd consider corrupted. It's
> possible that a subsequent "git fsck" will be less confused now.

This is done before renaming the file, so git fsck is going to see (at
most) a tmp_obj_?????? file, which it ignores in either case, right?

> The real reason to make this change is that in a subsequent commit
> we'll split this code in write_loose_object() into a utility function,
> all its callers will want the preceding sanity checks, but not the
> "oideq" check. By moving the close_loose_object() earlier it'll be
> easier to reason about the introduction of the utility function.

This sounds like a patch would move the close_loose_object() call to
some other place, but that's not the case.  The sequence below (starting
from the close_loose_object() call) is still present after applying the
whole series, so it seems this patch is not necessary.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  object-file.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index 79eb8339b6..e4a83012ba 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2012,12 +2012,12 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
>  		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
>  		    ret);
>  	the_hash_algo->final_oid_fn(&parano_oid, &c);
> +	close_loose_object(fd, tmp_file.buf);
> +
>  	if (!oideq(oid, &parano_oid))
>  		die(_("confused by unstable object source data for %s"),
>  		    oid_to_hex(oid));
>
> -	close_loose_object(fd, tmp_file.buf);
> -
>  	if (mtime) {
>  		struct utimbuf utb;
>  		utb.actime = mtime;
Junio C Hamano June 10, 2022, 9:33 p.m. UTC | #2
René Scharfe <l.s.r@web.de> writes:

> Am 10.06.22 um 16:46 schrieb Han Xin:
>> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>
>> Change write_loose_object() to do an fsync() and close() before the
>> oideq() sanity check at the end. This change re-joins code that was
>> split up by the die() sanity check added in 748af44c63e (sha1_file: be
>> paranoid when creating loose objects, 2010-02-21).
>>
>> I don't think that this change matters in itself, if we called die()
>> it was possible that our data wouldn't fully make it to disk, but in
>> any case we were writing data that we'd consider corrupted. It's
>> possible that a subsequent "git fsck" will be less confused now.
>
> This is done before renaming the file, so git fsck is going to see (at
> most) a tmp_obj_?????? file, which it ignores in either case, right?

Yes, I thought I pointed that out in my review on the previous
round, but I missed that it was still here in this round X-<.

Thanks for noticing.
Han Xin June 11, 2022, 1:50 a.m. UTC | #3
On Sat, Jun 11, 2022 at 5:33 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> René Scharfe <l.s.r@web.de> writes:
>
> > Am 10.06.22 um 16:46 schrieb Han Xin:
> >> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >>
> >> Change write_loose_object() to do an fsync() and close() before the
> >> oideq() sanity check at the end. This change re-joins code that was
> >> split up by the die() sanity check added in 748af44c63e (sha1_file: be
> >> paranoid when creating loose objects, 2010-02-21).
> >>
> >> I don't think that this change matters in itself, if we called die()
> >> it was possible that our data wouldn't fully make it to disk, but in
> >> any case we were writing data that we'd consider corrupted. It's
> >> possible that a subsequent "git fsck" will be less confused now.
> >
> > This is done before renaming the file, so git fsck is going to see (at
> > most) a tmp_obj_?????? file, which it ignores in either case, right?
>
> Yes, I thought I pointed that out in my review on the previous
> round, but I missed that it was still here in this round X-<.
>
> Thanks for noticing.
>

Yes, agree with both of you, I'll be removing this patch in the next series.

This patch was first introduced in v10[1], close_loose_object() was moved
to end_loose_object_common(), but it was put back in v12[2]. It is indeed
no longer necessary now.

1. https://lore.kernel.org/git/patch-v10-3.6-0e33d2a6e35-20220204T135538Z-avarab@gmail.com/
2. https://lore.kernel.org/git/patch-v12-2.8-54060eb8c6b-20220329T135446Z-avarab@gmail.com/

Thank you both.
-Han Xin

-Han Xin
diff mbox series

Patch

diff --git a/object-file.c b/object-file.c
index 79eb8339b6..e4a83012ba 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2012,12 +2012,12 @@  static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
 		    ret);
 	the_hash_algo->final_oid_fn(&parano_oid, &c);
+	close_loose_object(fd, tmp_file.buf);
+
 	if (!oideq(oid, &parano_oid))
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd, tmp_file.buf);
-
 	if (mtime) {
 		struct utimbuf utb;
 		utb.actime = mtime;