Message ID | 2d5a0536af1a6d45835622e2c020266079fa0873.1717712358.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 8981dca8bc717d7656f28fc375b513b91b365360 |
Headers | show |
Series | commit-graph/server-info: use tempfile.h in more places | expand |
Taylor Blau <me@ttaylorr.com> writes: > @@ -121,27 +120,22 @@ static int update_info_file(char *path, > } > > uic.cur_fp = NULL; > - if (fclose(to_close)) > - goto out; We should fflush() of cur_fp before nuking it, at least, no? In the original code, to_close was a mere copy of uic.cur_fp and we made sure that anything buffered at the stdio layer are flushed to the underlying file desciptor (fd that we obtained from git_mkstemp_mode() in the original code) with this fclose() call. We no longer do so. We later call rename_tempfile() to close the underlying file descriptor and move the temporary file to its final place, but I do not see what guarantee we have that we do not lose what we had buffered in the stdio with the updated code. > if (uic_is_stale(&uic)) { > - if (adjust_shared_perm(tmp) < 0) > + if (adjust_shared_perm(get_tempfile_path(f)) < 0) > goto out; > - if (rename(tmp, path) < 0) > + if (rename_tempfile(&f, path) < 0) > goto out; > } else { > - unlink(tmp); > + delete_tempfile(&f); > } > ret = 0; > > out: > if (ret) { > error_errno("unable to update %s", path); > - if (uic.cur_fp) > - fclose(uic.cur_fp); > - else if (fd >= 0) > - close(fd); > - unlink(tmp); > + if (f) > + delete_tempfile(&f); > } > free(tmp); > if (uic.old_fp)
On Fri, Jun 07, 2024 at 09:02:14AM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > @@ -121,27 +120,22 @@ static int update_info_file(char *path, > > } > > > > uic.cur_fp = NULL; > > - if (fclose(to_close)) > > - goto out; > > We should fflush() of cur_fp before nuking it, at least, no? > > In the original code, to_close was a mere copy of uic.cur_fp and we > made sure that anything buffered at the stdio layer are flushed to > the underlying file desciptor (fd that we obtained from > git_mkstemp_mode() in the original code) with this fclose() call. > > We no longer do so. We later call rename_tempfile() to close the > underlying file descriptor and move the temporary file to its final > place, but I do not see what guarantee we have that we do not lose > what we had buffered in the stdio with the updated code. rename_tempfile() first calls close_tempfile_gently() before calling rename(). close_tempfile_gently() calls fclose() on temp->fp before returns, so we get our fflush() call implicitly there. IOW, the tempfile.h API is happy for us to call rename_tempfile() without having explicitly closed or flushed the underlying file pointer. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: >> We no longer do so. We later call rename_tempfile() to close the >> underlying file descriptor and move the temporary file to its final >> place, but I do not see what guarantee we have that we do not lose >> what we had buffered in the stdio with the updated code. > > rename_tempfile() first calls close_tempfile_gently() before calling > rename(). close_tempfile_gently() calls fclose() on temp->fp before > returns, so we get our fflush() call implicitly there. OK. I didn't see that fclose(). Then we are safe. Thanks.
On Thu, Jun 06, 2024 at 06:19:31PM -0400, Taylor Blau wrote: > The update_info_file() function within server-info.c is responsible for > moving the info/refs and info/packs files around when updating server > info. > > These updates are staged into a temporary file and then moved into place > atomically to avoid race conditions when reading those files. However, > the temporary file used to stage these changes is managed outside of the > tempfile.h API, and thus survives process death. > > Manage these files instead with the tempfile.h API so that they are > automatically cleaned up upon abnormal process death. Makes sense. I was going to suggest that these could even be lockfiles, but it is intentional to let two simultaneous processes race (with an atomic last-one-wins result). See d38379ece9 (make update-server-info more robust, 2014-09-13). > Unfortunately, and unlike in the previous step, there isn't a > straightforward way to inject a failure into the update-server-info step > that causes us to die() rather than take the cleanup path in label > 'out', hence the lack of a test here. That sounds like a challenge. ;) $ echo garbage >.git/packed-refs $ git update-server-info fatal: unexpected line in .git/packed-refs: garbage $ ls .git/info/ exclude refs refs_QYvQGb I don't know if it's worth adding such a test. It seems rather brittle to assume that we'd die() here (let alone that we are using the files backend at all). > @@ -86,13 +86,12 @@ static int update_info_file(char *path, > }; > > safe_create_leading_directories(path); > - fd = git_mkstemp_mode(tmp, 0666); > - if (fd < 0) > + f = mks_tempfile_m(tmp, 0666); > + if (!f) > goto out; > - to_close = uic.cur_fp = fdopen(fd, "w"); > + uic.cur_fp = fdopen_tempfile(f, "w"); OK, good, fdopen_tempfile() means that the FILE handle is owned by the tempfile, too. > @@ -121,27 +120,22 @@ static int update_info_file(char *path, > } > > uic.cur_fp = NULL; > - if (fclose(to_close)) > - goto out; And we don't need to fclose() anymore since the tempfile code handles that for us. Nice. > if (uic_is_stale(&uic)) { > - if (adjust_shared_perm(tmp) < 0) > + if (adjust_shared_perm(get_tempfile_path(f)) < 0) > goto out; > - if (rename(tmp, path) < 0) > + if (rename_tempfile(&f, path) < 0) > goto out; > } else { > - unlink(tmp); > + delete_tempfile(&f); > } > ret = 0; OK, so we always rename or delete here, unless we jumped to the error path... > out: > if (ret) { > error_errno("unable to update %s", path); > - if (uic.cur_fp) > - fclose(uic.cur_fp); > - else if (fd >= 0) > - close(fd); > - unlink(tmp); > + if (f) > + delete_tempfile(&f); > } And here we do an explicit delete, which is good for a lib-ified world where the process doesn't just exit immediately. I think you could actually call delete_tempfile() unconditionally, even outside the "if (ret)" block. It is a noop for a NULL tempfile (so OK even if we jump to "out" before opening it). And a renamed tempfile goes back to NULL as well. I.e., one of the advantages to using the tempfile interface is that it's always in a consistent state, and you just use delete() on exit, like we do strbuf_release(). That said, it's a pretty minor style question, and I don't think is worth a re-roll. -Peff
diff --git a/server-info.c b/server-info.c index 6feaa457c5c..37d1085982e 100644 --- a/server-info.c +++ b/server-info.c @@ -13,6 +13,7 @@ #include "object-store-ll.h" #include "server-info.h" #include "strbuf.h" +#include "tempfile.h" struct update_info_ctx { FILE *cur_fp; @@ -75,9 +76,8 @@ static int update_info_file(char *path, int force) { char *tmp = mkpathdup("%s_XXXXXX", path); + struct tempfile *f = NULL; int ret = -1; - int fd = -1; - FILE *to_close; struct update_info_ctx uic = { .cur_fp = NULL, .old_fp = NULL, @@ -86,13 +86,12 @@ static int update_info_file(char *path, }; safe_create_leading_directories(path); - fd = git_mkstemp_mode(tmp, 0666); - if (fd < 0) + f = mks_tempfile_m(tmp, 0666); + if (!f) goto out; - to_close = uic.cur_fp = fdopen(fd, "w"); + uic.cur_fp = fdopen_tempfile(f, "w"); if (!uic.cur_fp) goto out; - fd = -1; /* no problem on ENOENT and old_fp == NULL, it's stale, now */ if (!force) @@ -121,27 +120,22 @@ static int update_info_file(char *path, } uic.cur_fp = NULL; - if (fclose(to_close)) - goto out; if (uic_is_stale(&uic)) { - if (adjust_shared_perm(tmp) < 0) + if (adjust_shared_perm(get_tempfile_path(f)) < 0) goto out; - if (rename(tmp, path) < 0) + if (rename_tempfile(&f, path) < 0) goto out; } else { - unlink(tmp); + delete_tempfile(&f); } ret = 0; out: if (ret) { error_errno("unable to update %s", path); - if (uic.cur_fp) - fclose(uic.cur_fp); - else if (fd >= 0) - close(fd); - unlink(tmp); + if (f) + delete_tempfile(&f); } free(tmp); if (uic.old_fp)
The update_info_file() function within server-info.c is responsible for moving the info/refs and info/packs files around when updating server info. These updates are staged into a temporary file and then moved into place atomically to avoid race conditions when reading those files. However, the temporary file used to stage these changes is managed outside of the tempfile.h API, and thus survives process death. Manage these files instead with the tempfile.h API so that they are automatically cleaned up upon abnormal process death. Unfortunately, and unlike in the previous step, there isn't a straightforward way to inject a failure into the update-server-info step that causes us to die() rather than take the cleanup path in label 'out', hence the lack of a test here. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- server-info.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-)