Message ID | 20221119201213.2398081-1-e@80x24.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 69747653523afa3322e0f8dd6a5a7d30184694c3 |
Series | prune: quiet ENOENT on missing directories |
Eric Wong <e@80x24.org> writes:

> $GIT_DIR/objects/pack may be removed to save inodes in shared
> repositories. Quiet down prune in cases where either
> $GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,

Wouldn't setup.c::is_git_directory() say "nope, you do not have a
repository there" if you are missing $GIT_DIR/objects? So I suspect
that the only case this matters in practice is a missing pack/
subdirectory.

I agree that silently ignoring missing objects/pack/ is perfectly
fine, whether we auto-vivify it when we actually create a pack.

> but emit the system error in other cases to help users diagnose
> permissions problems or resource constraints.

OK.

> @@ -127,7 +127,9 @@ static void remove_temporary_files(const char *path)
>
>  	dir = opendir(path);
>  	if (!dir) {
> -		fprintf(stderr, "Unable to open directory %s\n", path);
> +		if (errno != ENOENT)
> +			fprintf(stderr, "Unable to open directory %s: %s\n",
> +				path, strerror(errno));
>  		return;
>  	}

This is called twice, with $GIT_OBJECT_DIRECTORY and its pack
subdirectory, as it does not recurse.

This is a tangent, I have to wonder how effective the first call
would be, though. When writing a loose object file, we compute its
object name first in-core and determine the final filename, create a
temporary file in the same directory as the final file, write into
it and then finally rename the temporary to the final name. The
fan-out $GIT_OBJECT_DIRECTORY/??/ directories may have temporary
files left when such a process crashed, but do we create cruft "git
prune" should remove in $GIT_OBJECT_DIRECTORY/ itself?

> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
> index 8ae314af58..d65a5f94b4 100755
> --- a/t/t5304-prune.sh
> +++ b/t/t5304-prune.sh
> @@ -29,6 +29,14 @@ test_expect_success setup '
>  	git gc
>  '
>
> +test_expect_success 'bare repo prune is quiet without $GIT_DIR/objects/pack' '
> +	git clone -q --shared --template= --bare . bare.git &&
> +	rmdir bare.git/objects/pack &&
> +	git --git-dir=bare.git prune --no-progress 2>prune.err &&
> +	test_must_be_empty prune.err &&
> +	rm -r bare.git prune.err
> +'
> +
>  test_expect_success 'prune stale packs' '
>  	orig_pack=$(echo .git/objects/pack/*.pack) &&
>  	>.git/objects/tmp_1.pack &&
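For readers following the tangent about where temporary cruft comes from, the write-then-rename sequence described in the message above can be sketched roughly as follows. This is a standalone illustration, not Git's implementation: `write_via_tempfile` is a hypothetical helper, the fixed-size path buffers are a simplification, and the `tmp_obj_` prefix is used here only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a Git function: store `len` bytes from `buf`
 * as `dir`/`name`, going through a temporary file in the same directory.
 * Returns 0 on success and -1 on failure.
 */
static int write_via_tempfile(const char *dir, const char *name,
			      const void *buf, size_t len)
{
	char tmp[4096], final[4096];
	int fd, n;

	/* Build both paths up front; the "tmp_obj_" prefix is illustrative. */
	n = snprintf(tmp, sizeof(tmp), "%s/tmp_obj_XXXXXX", dir);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;
	n = snprintf(final, sizeof(final), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(final))
		return -1;

	fd = mkstemp(tmp);	/* creates and opens a uniquely named file */
	if (fd < 0)
		return -1;

	/* A crash anywhere before rename() leaves `tmp` behind as cruft. */
	if (write(fd, buf, len) != (ssize_t)len) {
		close(fd);
		unlink(tmp);
		return -1;
	}
	if (close(fd) < 0 || rename(tmp, final) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Because the temporary file lives next to its final name, any leftover from an interrupted write would sit in the same directory as the object it was meant to become, which is the point behind the question about the top-level directory.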
Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
> > $GIT_DIR/objects/pack may be removed to save inodes in shared
> > repositories. Quiet down prune in cases where either
> > $GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
>
> Wouldn't setup.c::is_git_directory() say "nope, you do not have a
> repository there" if you are missing $GIT_DIR/objects? So I suspect
> that the only case this matters in practice is a missing pack/
> subdirectory.

Right. Removing $GIT_DIR/objects isn't currently OK, but maybe
someday it could be... Supporting missing pack/ is the primary
reason for this change, but making a small step towards allowing
objects/-free $GIT_DIR doesn't seem harmful.

> I agree that silently ignoring missing objects/pack/ is perfectly
> fine, whether we auto-vivify it when we actually create a pack.
>
> > but emit the system error in other cases to help users diagnose
> > permissions problems or resource constraints.
>
> OK.
>
> > @@ -127,7 +127,9 @@ static void remove_temporary_files(const char *path)
> >
> >  	dir = opendir(path);
> >  	if (!dir) {
> > -		fprintf(stderr, "Unable to open directory %s\n", path);
> > +		if (errno != ENOENT)
> > +			fprintf(stderr, "Unable to open directory %s: %s\n",
> > +				path, strerror(errno));
> >  		return;
> >  	}
>
> This is called twice, with $GIT_OBJECT_DIRECTORY and its pack
> subdirectory, as it does not recurse.

Right.

> This is a tangent, I have to wonder how effective the first call
> would be, though. When writing a loose object file, we compute its
> object name first in-core and determine the final filename, create a
> temporary file in the same directory as the final file, write into
> it and then finally rename the temporary to the final name. The
> fan-out $GIT_OBJECT_DIRECTORY/??/ directories may have temporary
> files left when such a process crashed, but do we create cruft "git
> prune" should remove in $GIT_OBJECT_DIRECTORY/ itself?

Good question, perhaps this could be a followup:

    diff --git a/builtin/prune.c b/builtin/prune.c
    index 2719220108..041c45ecbe 100644
    --- a/builtin/prune.c
    +++ b/builtin/prune.c
    @@ -188,7 +188,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
     			     prune_cruft, prune_subdir, &revs);

     	prune_packed_objects(show_only ? PRUNE_PACKED_DRY_RUN : 0);
    -	remove_temporary_files(get_object_directory());
     	s = mkpathdup("%s/pack", get_object_directory());
     	remove_temporary_files(s);
     	free(s);

OTOH, perhaps there are some 3rd-party tools (e.g. backup tools) that
leave stuff in top-level objects/ and we'd risk breaking a rare setup
via ENOSPC.
On Sat, Nov 19 2022, Eric Wong wrote:

> $GIT_DIR/objects/pack may be removed to save inodes in shared
> repositories. Quiet down prune in cases where either
> $GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
> but emit the system error in other cases to help users diagnose
> permissions problems or resource constraints.
>
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  builtin/prune.c  | 4 +++-
>  t/t5304-prune.sh | 8 ++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/prune.c b/builtin/prune.c
> index df376b2ed1..2719220108 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -127,7 +127,9 @@ static void remove_temporary_files(const char *path)
>
>  	dir = opendir(path);
>  	if (!dir) {
> -		fprintf(stderr, "Unable to open directory %s\n", path);
> +		if (errno != ENOENT)
> +			fprintf(stderr, "Unable to open directory %s: %s\n",
> +				path, strerror(errno));

We sometimes use fprintf() instead of "error" or "warning" for output
compatibility with an older version, or because it's written in an
old style.

But as you're changing the line anyway let's not re-invent
error_errno() or warning_errno(), but just use those. We could also
s/^Unable/unable/ in the message while at it, per CodingGuidelines.

>  		return;
>  	}
>  	while ((de = readdir(dir)) != NULL)
> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
> index 8ae314af58..d65a5f94b4 100755
> --- a/t/t5304-prune.sh
> +++ b/t/t5304-prune.sh
> @@ -29,6 +29,14 @@ test_expect_success setup '
>  	git gc
>  '
>
> +test_expect_success 'bare repo prune is quiet without $GIT_DIR/objects/pack' '
> +	git clone -q --shared --template= --bare . bare.git &&
> +	rmdir bare.git/objects/pack &&
> +	git --git-dir=bare.git prune --no-progress 2>prune.err &&
> +	test_must_be_empty prune.err &&
> +	rm -r bare.git prune.err
> +'
> +
>  test_expect_success 'prune stale packs' '
>  	orig_pack=$(echo .git/objects/pack/*.pack) &&
>  	>.git/objects/tmp_1.pack &&

This seems like a good isolated change, but FWIW I think what we
really should be doing here is using the "report_garbage" facility
added in 543c5caa6c9 (count-objects: report garbage files in pack
directory too, 2013-02-15) and 478f34d2b6e (gc: remove garbage .idx
files from pack dir, 2015-11-03) for "pack".

I.e. we have already iterated over "pack" and found all the files
therein, and in packfile.c error_errno() etc. That we're
re-opendir()-ing the "pack", walking it again etc. doesn't make much
sense, or does it?

Then the:

	remove_temporary_files(get_object_directory());

Also seems odd, just a few lines above we passed "prune_cruft" to
"for_each_loose_file_in_objdir()", haven't we already walked the
loose object dir & removed temporary cruft there?
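To make the review suggestion above concrete, here is a rough standalone sketch of the patched check with the message routed through a warning_errno()-style helper and lowercased. Git's own warning_errno() is only available inside the Git tree, so `sketch_warning_errno` below is a local stand-in written for this example, and `open_dir_or_warn` with its `warned` out-parameter is likewise hypothetical; neither is proposed code for builtin/prune.c.

```c
#include <dirent.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Local stand-in for a warning_errno()-style helper: prints
 * "warning: <message>: <strerror(errno)>" to stderr.
 */
static void sketch_warning_errno(const char *fmt, ...)
{
	int saved_errno = errno;	/* stdio calls below may clobber errno */
	va_list ap;

	fputs("warning: ", stderr);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fprintf(stderr, ": %s\n", strerror(saved_errno));
}

/*
 * Hypothetical wrapper around the check in the patch: stay quiet when
 * the directory is simply missing, warn for any other failure.
 * Returns 0 if the directory could be opened, -1 otherwise; `*warned`
 * reports whether a message was printed.
 */
static int open_dir_or_warn(const char *path, int *warned)
{
	DIR *dir = opendir(path);

	*warned = 0;
	if (!dir) {
		if (errno != ENOENT) {
			sketch_warning_errno("unable to open directory %s", path);
			*warned = 1;
		}
		return -1;
	}
	closedir(dir);
	return 0;
}
```

With this shape, a missing directory produces no output, while a path that exists but is not a directory, or one that cannot be read, still yields a diagnostic that includes the system error text.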
Eric Wong <e@80x24.org> writes:

> Good question, perhaps this could be a followup:
>
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 2719220108..041c45ecbe 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -188,7 +188,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>  			     prune_cruft, prune_subdir, &revs);
>
>  	prune_packed_objects(show_only ? PRUNE_PACKED_DRY_RUN : 0);
> -	remove_temporary_files(get_object_directory());
>  	s = mkpathdup("%s/pack", get_object_directory());
>  	remove_temporary_files(s);
>  	free(s);

I actually was hinting at making the remove_temporary_files()
recurse, so that you do not need the separate invocation in pack/
subdirectory.

Or make 256 calls for each of the fan-out subdirectory, in which
case the ENOENT silencing you did would really matter and shine.
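The "256 calls" alternative mentioned above might look roughly like the sketch below. It is a simplified standalone illustration, not a patch: `remove_tmp_in_dir` and `remove_tmp_in_fanout` are hypothetical names, the `tmp_` prefix match is an assumption made for this example, and the sketch unlinks matches unconditionally, with no expiry or dry-run handling.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: unlink entries named "tmp_*" directly inside
 * `path`. A missing directory is skipped silently; other opendir()
 * failures are reported. Returns the number of files removed.
 */
static int remove_tmp_in_dir(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *de;
	int removed = 0;

	if (!dir) {
		if (errno != ENOENT)
			fprintf(stderr, "unable to open directory %s: %s\n",
				path, strerror(errno));
		return 0;
	}
	while ((de = readdir(dir)) != NULL) {
		char file[4096];
		int n;

		if (strncmp(de->d_name, "tmp_", 4))
			continue;	/* not a temporary file */
		n = snprintf(file, sizeof(file), "%s/%s", path, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(file))
			continue;	/* path too long for this sketch */
		if (!unlink(file))
			removed++;
	}
	closedir(dir);
	return removed;
}

/* Visit objdir/00 through objdir/ff; most may not exist, which is fine. */
static int remove_tmp_in_fanout(const char *objdir)
{
	int i, removed = 0;

	for (i = 0; i < 256; i++) {
		char sub[4096];
		int n = snprintf(sub, sizeof(sub), "%s/%02x", objdir, i);

		if (n < 0 || (size_t)n >= sizeof(sub))
			continue;
		removed += remove_tmp_in_dir(sub);
	}
	return removed;
}
```

In a repository with only a handful of loose objects, most of the 256 subdirectories are absent, so without the ENOENT check each run would print a message for nearly every one of them.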
Junio C Hamano <gitster@pobox.com> writes:

>>  	prune_packed_objects(show_only ? PRUNE_PACKED_DRY_RUN : 0);
>> -	remove_temporary_files(get_object_directory());
>>  	s = mkpathdup("%s/pack", get_object_directory());
>>  	remove_temporary_files(s);
>>  	free(s);
>
> I actually was hinting at making the remove_temporary_files()
> recurse, so that you do not need the separate invocation in pack/
> subdirectory.
>
> Or make 256 calls for each of the fan-out subdirectory, in which
> case the ENOENT silencing you did would really matter and shine.

But of course, neither is any part of this topic. They are possible
follow-on works.

Thanks and sorry for making a confusing statement that could be
mistaken as "let's do this too", which wasn't what I meant.
    diff --git a/builtin/prune.c b/builtin/prune.c
    index df376b2ed1..2719220108 100644
    --- a/builtin/prune.c
    +++ b/builtin/prune.c
    @@ -127,7 +127,9 @@ static void remove_temporary_files(const char *path)

     	dir = opendir(path);
     	if (!dir) {
    -		fprintf(stderr, "Unable to open directory %s\n", path);
    +		if (errno != ENOENT)
    +			fprintf(stderr, "Unable to open directory %s: %s\n",
    +				path, strerror(errno));
     		return;
     	}
     	while ((de = readdir(dir)) != NULL)
    diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
    index 8ae314af58..d65a5f94b4 100755
    --- a/t/t5304-prune.sh
    +++ b/t/t5304-prune.sh
    @@ -29,6 +29,14 @@ test_expect_success setup '
     	git gc
     '

    +test_expect_success 'bare repo prune is quiet without $GIT_DIR/objects/pack' '
    +	git clone -q --shared --template= --bare . bare.git &&
    +	rmdir bare.git/objects/pack &&
    +	git --git-dir=bare.git prune --no-progress 2>prune.err &&
    +	test_must_be_empty prune.err &&
    +	rm -r bare.git prune.err
    +'
    +
     test_expect_success 'prune stale packs' '
     	orig_pack=$(echo .git/objects/pack/*.pack) &&
     	>.git/objects/tmp_1.pack &&
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories. Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.

Signed-off-by: Eric Wong <e@80x24.org>

---
 builtin/prune.c  | 4 +++-
 t/t5304-prune.sh | 8 ++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)