diff mbox series

[2/2] server-info.c: remove temporary info files on exit

Message ID 2d5a0536af1a6d45835622e2c020266079fa0873.1717712358.git.me@ttaylorr.com (mailing list archive)
State Accepted
Commit 8981dca8bc717d7656f28fc375b513b91b365360
Headers show
Series commit-graph/server-info: use tempfile.h in more places | expand

Commit Message

Taylor Blau June 6, 2024, 10:19 p.m. UTC
The update_info_file() function within server-info.c is responsible for
moving the info/refs and info/packs files around when updating server
info.

These updates are staged into a temporary file and then moved into place
atomically to avoid race conditions when reading those files. However,
the temporary file used to stage these changes is managed outside of the
tempfile.h API, and thus survives process death.

Manage these files instead with the tempfile.h API so that they are
automatically cleaned up upon abnormal process death.

Unfortunately, and unlike in the previous step, there isn't a
straightforward way to inject a failure into the update-server-info step
that causes us to die() rather than take the cleanup path in label
'out', hence the lack of a test here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 server-info.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

Comments

Junio C Hamano June 7, 2024, 4:02 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> @@ -121,27 +120,22 @@ static int update_info_file(char *path,
>  	}
>  
>  	uic.cur_fp = NULL;
> -	if (fclose(to_close))
> -		goto out;

We should fflush() of cur_fp before nuking it, at least, no?

In the original code, to_close was a mere copy of uic.cur_fp and we
made sure that anything buffered at the stdio layer are flushed to
the underlying file desciptor (fd that we obtained from
git_mkstemp_mode() in the original code) with this fclose() call.

We no longer do so.  We later call rename_tempfile() to close the
underlying file descriptor and move the temporary file to its final
place, but I do not see what guarantee we have that we do not lose
what we had buffered in the stdio with the updated code.

>  	if (uic_is_stale(&uic)) {
> -		if (adjust_shared_perm(tmp) < 0)
> +		if (adjust_shared_perm(get_tempfile_path(f)) < 0)
>  			goto out;
> -		if (rename(tmp, path) < 0)
> +		if (rename_tempfile(&f, path) < 0)
>  			goto out;
>  	} else {
> -		unlink(tmp);
> +		delete_tempfile(&f);
>  	}
>  	ret = 0;
>  
>  out:
>  	if (ret) {
>  		error_errno("unable to update %s", path);
> -		if (uic.cur_fp)
> -			fclose(uic.cur_fp);
> -		else if (fd >= 0)
> -			close(fd);
> -		unlink(tmp);
> +		if (f)
> +			delete_tempfile(&f);
>  	}
>  	free(tmp);
>  	if (uic.old_fp)
Taylor Blau June 7, 2024, 9:44 p.m. UTC | #2
On Fri, Jun 07, 2024 at 09:02:14AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > @@ -121,27 +120,22 @@ static int update_info_file(char *path,
> >  	}
> >
> >  	uic.cur_fp = NULL;
> > -	if (fclose(to_close))
> > -		goto out;
>
> We should fflush() of cur_fp before nuking it, at least, no?
>
> In the original code, to_close was a mere copy of uic.cur_fp and we
> made sure that anything buffered at the stdio layer are flushed to
> the underlying file desciptor (fd that we obtained from
> git_mkstemp_mode() in the original code) with this fclose() call.
>
> We no longer do so.  We later call rename_tempfile() to close the
> underlying file descriptor and move the temporary file to its final
> place, but I do not see what guarantee we have that we do not lose
> what we had buffered in the stdio with the updated code.

rename_tempfile() first calls close_tempfile_gently() before calling
rename(). close_tempfile_gently() calls fclose() on temp->fp before
returns, so we get our fflush() call implicitly there.

IOW, the tempfile.h API is happy for us to call rename_tempfile()
without having explicitly closed or flushed the underlying file pointer.

Thanks,
Taylor
Junio C Hamano June 7, 2024, 9:49 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

>> We no longer do so.  We later call rename_tempfile() to close the
>> underlying file descriptor and move the temporary file to its final
>> place, but I do not see what guarantee we have that we do not lose
>> what we had buffered in the stdio with the updated code.
>
> rename_tempfile() first calls close_tempfile_gently() before calling
> rename(). close_tempfile_gently() calls fclose() on temp->fp before
> returns, so we get our fflush() call implicitly there.

OK.  I didn't see that fclose().  Then we are safe.

Thanks.
Jeff King June 8, 2024, 10:25 a.m. UTC | #4
On Thu, Jun 06, 2024 at 06:19:31PM -0400, Taylor Blau wrote:

> The update_info_file() function within server-info.c is responsible for
> moving the info/refs and info/packs files around when updating server
> info.
> 
> These updates are staged into a temporary file and then moved into place
> atomically to avoid race conditions when reading those files. However,
> the temporary file used to stage these changes is managed outside of the
> tempfile.h API, and thus survives process death.
> 
> Manage these files instead with the tempfile.h API so that they are
> automatically cleaned up upon abnormal process death.

Makes sense. I was going to suggest that these could even be lockfiles,
but it is intentional to let two simultaneous processes race (with an
atomic last-one-wins result). See d38379ece9 (make update-server-info
more robust, 2014-09-13).

> Unfortunately, and unlike in the previous step, there isn't a
> straightforward way to inject a failure into the update-server-info step
> that causes us to die() rather than take the cleanup path in label
> 'out', hence the lack of a test here.

That sounds like a challenge. ;)

  $ echo garbage >.git/packed-refs
  $ git update-server-info
  fatal: unexpected line in .git/packed-refs: garbage
  $ ls .git/info/
  exclude  refs  refs_QYvQGb

I don't know if it's worth adding such a test. It seems rather brittle
to assume that we'd die() here (let alone that we are using the files
backend at all).

> @@ -86,13 +86,12 @@ static int update_info_file(char *path,
>  	};
>  
>  	safe_create_leading_directories(path);
> -	fd = git_mkstemp_mode(tmp, 0666);
> -	if (fd < 0)
> +	f = mks_tempfile_m(tmp, 0666);
> +	if (!f)
>  		goto out;
> -	to_close = uic.cur_fp = fdopen(fd, "w");
> +	uic.cur_fp = fdopen_tempfile(f, "w");

OK, good, fdopen_tempfile() means that the FILE handle is owned by the
tempfile, too.

> @@ -121,27 +120,22 @@ static int update_info_file(char *path,
>  	}
>  
>  	uic.cur_fp = NULL;
> -	if (fclose(to_close))
> -		goto out;

And we don't need to fclose() anymore since the tempfile code handles
that for us. Nice.

>  	if (uic_is_stale(&uic)) {
> -		if (adjust_shared_perm(tmp) < 0)
> +		if (adjust_shared_perm(get_tempfile_path(f)) < 0)
>  			goto out;
> -		if (rename(tmp, path) < 0)
> +		if (rename_tempfile(&f, path) < 0)
>  			goto out;
>  	} else {
> -		unlink(tmp);
> +		delete_tempfile(&f);
>  	}
>  	ret = 0;

OK, so we always rename or delete here, unless we jumped to the error
path...

>  out:
>  	if (ret) {
>  		error_errno("unable to update %s", path);
> -		if (uic.cur_fp)
> -			fclose(uic.cur_fp);
> -		else if (fd >= 0)
> -			close(fd);
> -		unlink(tmp);
> +		if (f)
> +			delete_tempfile(&f);
>  	}

And here we do an explicit delete, which is good for a lib-ified world
where the process doesn't just exit immediately.

I think you could actually call delete_tempfile() unconditionally, even
outside the "if (ret)" block. It is a noop for a NULL tempfile (so OK
even if we jump to "out" before opening it). And a renamed tempfile goes
back to NULL as well.

I.e., one of the advantages to using the tempfile interface is that it's
always in a consistent state, and you just use delete() on exit, like we
do strbuf_release().

That said, it's a pretty minor style question, and I don't think is
worth a re-roll.

-Peff
diff mbox series

Patch

diff --git a/server-info.c b/server-info.c
index 6feaa457c5c..37d1085982e 100644
--- a/server-info.c
+++ b/server-info.c
@@ -13,6 +13,7 @@ 
 #include "object-store-ll.h"
 #include "server-info.h"
 #include "strbuf.h"
+#include "tempfile.h"
 
 struct update_info_ctx {
 	FILE *cur_fp;
@@ -75,9 +76,8 @@  static int update_info_file(char *path,
 			int force)
 {
 	char *tmp = mkpathdup("%s_XXXXXX", path);
+	struct tempfile *f = NULL;
 	int ret = -1;
-	int fd = -1;
-	FILE *to_close;
 	struct update_info_ctx uic = {
 		.cur_fp = NULL,
 		.old_fp = NULL,
@@ -86,13 +86,12 @@  static int update_info_file(char *path,
 	};
 
 	safe_create_leading_directories(path);
-	fd = git_mkstemp_mode(tmp, 0666);
-	if (fd < 0)
+	f = mks_tempfile_m(tmp, 0666);
+	if (!f)
 		goto out;
-	to_close = uic.cur_fp = fdopen(fd, "w");
+	uic.cur_fp = fdopen_tempfile(f, "w");
 	if (!uic.cur_fp)
 		goto out;
-	fd = -1;
 
 	/* no problem on ENOENT and old_fp == NULL, it's stale, now */
 	if (!force)
@@ -121,27 +120,22 @@  static int update_info_file(char *path,
 	}
 
 	uic.cur_fp = NULL;
-	if (fclose(to_close))
-		goto out;
 
 	if (uic_is_stale(&uic)) {
-		if (adjust_shared_perm(tmp) < 0)
+		if (adjust_shared_perm(get_tempfile_path(f)) < 0)
 			goto out;
-		if (rename(tmp, path) < 0)
+		if (rename_tempfile(&f, path) < 0)
 			goto out;
 	} else {
-		unlink(tmp);
+		delete_tempfile(&f);
 	}
 	ret = 0;
 
 out:
 	if (ret) {
 		error_errno("unable to update %s", path);
-		if (uic.cur_fp)
-			fclose(uic.cur_fp);
-		else if (fd >= 0)
-			close(fd);
-		unlink(tmp);
+		if (f)
+			delete_tempfile(&f);
 	}
 	free(tmp);
 	if (uic.old_fp)