Message ID | 25324fea5b7c7f748d7f4e1e40299c0af04006e8.1717712358.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | c195ecda7703098f28c160cc63306ca0e1488236 |
Headers | show |
Series | commit-graph/server-info: use tempfile.h in more places | expand |
Taylor Blau <me@ttaylorr.com> writes: > @@ -2133,8 +2132,6 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) > char *final_graph_name; > int result; > > - close(fd); > - > if (!chainf) { > error(_("unable to open commit-graph chain file")); > return -1; > @@ -2169,7 +2166,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) > free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]); > ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1] = final_graph_name; > > - result = rename(ctx->graph_name, final_graph_name); > + result = rename_tempfile(&graph_layer, final_graph_name); Before this rename, after the close(fd) we saw in the previous hunk, there is one early error return when we fail to rename the base graph file. Do we need to do anything there, or an unfinished tempfile getting removed at the process termination is sufficient for cleaning up the mess? Other than that, this looked quite straight-forward. Thanks.
On Fri, Jun 07, 2024 at 09:33:56AM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > @@ -2133,8 +2132,6 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) > > char *final_graph_name; > > int result; > > > > - close(fd); > > - > > if (!chainf) { > > error(_("unable to open commit-graph chain file")); > > return -1; > > @@ -2169,7 +2166,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) > > free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]); > > ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1] = final_graph_name; > > > > - result = rename(ctx->graph_name, final_graph_name); > > + result = rename_tempfile(&graph_layer, final_graph_name); > > Before this rename, after the close(fd) we saw in the previous hunk, > there is one early error return when we fail to rename the base > graph file. Do we need to do anything there, or an unfinished > tempfile getting removed at the process termination is sufficient > for cleaning up the mess? We could explicitly clean it up, but we'll do so implicitly upon exit, so I think it's fine to leave it as-is. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: >> > - result = rename(ctx->graph_name, final_graph_name); >> > + result = rename_tempfile(&graph_layer, final_graph_name); >> >> Before this rename, after the close(fd) we saw in the previous hunk, >> there is one early error return when we fail to rename the base >> graph file. Do we need to do anything there, or an unfinished >> tempfile getting removed at the process termination is sufficient >> for cleaning up the mess? > > We could explicitly clean it up, but we'll do so implicitly upon exit, > so I think it's fine to leave it as-is. I am not worried about cleaning it up. Upon exit, the underlying file descriptor will be closed, but this new code never does fclose() on the FILE* that has a buffer around the underlying file descriptor. How are we guaranteeing that we are not losing anything buffered but not flushed yet?
Taylor Blau <me@ttaylorr.com> writes: >> Before this rename, after the close(fd) we saw in the previous hunk, >> there is one early error return when we fail to rename the base >> graph file. Do we need to do anything there, or an unfinished >> tempfile getting removed at the process termination is sufficient >> for cleaning up the mess? > > We could explicitly clean it up, but we'll do so implicitly upon exit, > so I think it's fine to leave it as-is. OK. That sounds good.
On Fri, Jun 07, 2024 at 02:47:58PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > >> > - result = rename(ctx->graph_name, final_graph_name); > >> > + result = rename_tempfile(&graph_layer, final_graph_name); > >> > >> Before this rename, after the close(fd) we saw in the previous hunk, > >> there is one early error return when we fail to rename the base > >> graph file. Do we need to do anything there, or an unfinished > >> tempfile getting removed at the process termination is sufficient > >> for cleaning up the mess? > > > > We could explicitly clean it up, but we'll do so implicitly upon exit, > > so I think it's fine to leave it as-is. > > I am not worried about cleaning it up. Upon exit, the underlying > file descriptor will be closed, but this new code never does > fclose() on the FILE* that has a buffer around the underlying file > descriptor. How are we guaranteeing that we are not losing anything > buffered but not flushed yet? I'm not sure I understand. There is no "FILE *" at all in this patch (we use the descriptor directly via the hashfd interface). But even if there were, if we leave without renaming the temp file into place, then that tempfile will be deleted. So it wouldn't matter if we flushed to it or not, because all of those byte are destined to be thrown away. -Peff
On Thu, Jun 06, 2024 at 06:19:25PM -0400, Taylor Blau wrote: > @@ -2035,24 +2035,23 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) > LOCK_DIE_ON_ERROR, 0444); > free(lock_name); > > - fd = git_mkstemp_mode(ctx->graph_name, 0444); > - if (fd < 0) { > + graph_layer = mks_tempfile_m(ctx->graph_name, 0444); > + if (!graph_layer) { > error(_("unable to create temporary graph layer")); > return -1; > } > > - if (adjust_shared_perm(ctx->graph_name)) { > + if (adjust_shared_perm(get_tempfile_path(graph_layer))) { > error(_("unable to adjust shared permissions for '%s'"), > - ctx->graph_name); > + get_tempfile_path(graph_layer)); > return -1; > } Most errors will cause us to die(), but this "return" and the one that Junio noted later (when renaming the base graph file fails) mean we'll return with the tempfile still active. We'll clean it up when the program exits, but if there were a long-running program that called write_commit_graph_file(), that tempfile might hang around longer than necessary. But that is the same strategy that the existing code uses for the lock we use for the chain filename! And it is much worse there, because now it is not just a tempfile hanging around, but we're blocking anybody else from taking the lock. So I think it would be OK to punt on this for now. Your patch is not making the situation worse, and it's all a problem for a hypothetical libification of this function. > +test_expect_success 'temporary graph layer is discarded upon failure' ' > + git init layer-discard && > + ( > + cd layer-discard && > + > + test_commit A && > + test_commit B && > + > + # Intentionally remove commit "A" from the object store > + # so that the commit-graph machinery fails to parse the > + # parents of "B". > + # > + # This takes place after the commit-graph machinery has > + # initialized a new temporary file to store the contents > + # of the new graph layer, so will allow us to ensure > + # that the temporary file is discarded upon failure. > + rm $objdir/$(test_oid_to_path $(git rev-parse HEAD^)) && > + > + test_must_fail git commit-graph write --reachable --split && > + test_dir_is_empty $graphdir > + ) > +' I'm glad you were able to come up with a case that fails cleanly and non-racily. The exit code of rev-parse will be lost. I doubt that it matters in practice, but I wouldn't be surprised if our shell linting eventually learns to complain about this spot. Looks like there are a few similar ones in the test suite already, from t5329. I'd be content to leave it for now and deal with it later if somebody really wants to go on a crusade against lost exit codes. -Peff
diff --git a/commit-graph.c b/commit-graph.c index e5dd3553dfe..92c71637142 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -2002,8 +2002,8 @@ static int write_graph_chunk_base(struct hashfile *f, static int write_commit_graph_file(struct write_commit_graph_context *ctx) { uint32_t i; - int fd; struct hashfile *f; + struct tempfile *graph_layer; /* when ctx->split is non-zero */ struct lock_file lk = LOCK_INIT; const unsigned hashsz = the_hash_algo->rawsz; struct strbuf progress_title = STRBUF_INIT; @@ -2035,24 +2035,23 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) LOCK_DIE_ON_ERROR, 0444); free(lock_name); - fd = git_mkstemp_mode(ctx->graph_name, 0444); - if (fd < 0) { + graph_layer = mks_tempfile_m(ctx->graph_name, 0444); + if (!graph_layer) { error(_("unable to create temporary graph layer")); return -1; } - if (adjust_shared_perm(ctx->graph_name)) { + if (adjust_shared_perm(get_tempfile_path(graph_layer))) { error(_("unable to adjust shared permissions for '%s'"), - ctx->graph_name); + get_tempfile_path(graph_layer)); return -1; } - f = hashfd(fd, ctx->graph_name); + f = hashfd(get_tempfile_fd(graph_layer), get_tempfile_path(graph_layer)); } else { hold_lock_file_for_update_mode(&lk, ctx->graph_name, LOCK_DIE_ON_ERROR, 0444); - fd = get_lock_file_fd(&lk); - f = hashfd(fd, get_lock_file_path(&lk)); + f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); } cf = init_chunkfile(f); @@ -2133,8 +2132,6 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) char *final_graph_name; int result; - close(fd); - if (!chainf) { error(_("unable to open commit-graph chain file")); return -1; @@ -2169,7 +2166,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) free(ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1]); ctx->commit_graph_filenames_after[ctx->num_commit_graphs_after - 1] = final_graph_name; - result = rename(ctx->graph_name, final_graph_name); + result = rename_tempfile(&graph_layer, final_graph_name); for (i = 0; i < ctx->num_commit_graphs_after; i++) fprintf(get_lock_file_fp(&lk), "%s\n", ctx->commit_graph_hash_after[i]); diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 281266f7883..77e91547ea3 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -13,7 +13,8 @@ test_expect_success 'setup repo' ' git init && git config core.commitGraph true && git config gc.writeCommitGraph false && - infodir=".git/objects/info" && + objdir=".git/objects" && + infodir="$objdir/info" && graphdir="$infodir/commit-graphs" && test_oid_cache <<-EOM shallow sha1:2132 @@ -718,4 +719,27 @@ test_expect_success 'write generation data chunk when commit-graph chain is repl ) ' +test_expect_success 'temporary graph layer is discarded upon failure' ' + git init layer-discard && + ( + cd layer-discard && + + test_commit A && + test_commit B && + + # Intentionally remove commit "A" from the object store + # so that the commit-graph machinery fails to parse the + # parents of "B". + # + # This takes place after the commit-graph machinery has + # initialized a new temporary file to store the contents + # of the new graph layer, so will allow us to ensure + # that the temporary file is discarded upon failure. + rm $objdir/$(test_oid_to_path $(git rev-parse HEAD^)) && + + test_must_fail git commit-graph write --reachable --split && + test_dir_is_empty $graphdir + ) +' + test_done
Since the introduction of split commit graph layers in 92b1ea66b9a (Merge branch 'ds/commit-graph-incremental', 2019-07-19), the function write_commit_graph_file() has done the following when writing an incremental commit-graph layer: - used a lock_file to control access to the commit-graph-chain file - used an auxiliary file (whose descriptor was stored in 'fd') to write the new commit-graph layer itself Using a lock_file to control access to the commit-graph-chain is sensible, since only one writer may modify it at a time. Likewise, when the commit-graph machinery is writing out a single layer, the lock_file structure is used to modify the commit-graph itself. This is also sensible, since the non-incremental commit-graph may also have at most one writer. However, using an auxiliary temporary file without using the tempfile.h API means that writes that fail after the temporary graph layer has been created will leave around a file in $GIT_DIR/objects/info/commit-graphs/tmp_graph_XXXXXX The commit-graph-chain file and non-incremental commit-graph do not suffer from this problem as the lockfile.h API uses the tempfile.h API transparently, so processes that died before moving those finals into their final location cleaned up after themselves. Ensure that the temporary file used to write incremental commit-graphs is also managed with the tempfile.h API, to ensure that we do not ever leave tmp_graph_XXXXXX files laying around. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- commit-graph.c | 19 ++++++++----------- t/t5324-split-commit-graph.sh | 26 +++++++++++++++++++++++++- 2 files changed, 33 insertions(+), 12 deletions(-)