mbox series

[v2,0/3] object-file: retry linking file into place when occluding file vanishes

Message ID 20250106-b4-pks-object-file-racy-collision-check-v2-0-8b3984ecbb18@pks.im (mailing list archive)
Headers show
Series object-file: retry linking file into place when occluding file vanishes | expand

Message

Patrick Steinhardt Jan. 6, 2025, 9:24 a.m. UTC
Hi,

this small patch series adapts the race fix for collision checks when
moving object files into place [1] to retry linking the object into
place instead of silently ignoring the error, as suggested by Peff in
[2].

The series at [1] has already been merged into 'next', so this is built
on top of 1b4e9a5f8b (Merge branch 'ps/build-meson-html', 2025-01-02)
with ps/object-collision-check at 0ad3d65652 (object-file: fix race in
object collision check, 2024-12-30) merged into it.

Changes in v2:

  - Add retry counter so that we eventually abort in case the colliding
    files repeatedly reappears and vanishes again.
  - Evict the change to stop handling ENOENT for the source file specially
    into a separate commit.
  - Commit message improvements.
  - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-object-file-racy-collision-check-v1-0-6ef9e2da1f87@pks.im

Thanks!

Patrick

[1]: <20241230-b4-pks-object-file-racy-collision-check-v1-1-11571294e60a@pks.im>
[2]: <20241231014220.GA225521@coredump.intra.peff.net>

---
Patrick Steinhardt (3):
      object-file: rename variables in `check_collision()`
      object-file: don't special-case missing source file in collision check
      object-file: retry linking file into place when occluding file vanishes

 object-file.c | 66 +++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 25 deletions(-)

Range-diff versus v1:

1:  389b9e4180 = 1:  0c38089e9f object-file: rename variables in `check_collision()`
-:  ---------- > 2:  d4a7ec11c8 object-file: don't special-case missing source file in collision check
2:  91a8bdf37d ! 3:  587fd01fb6 object-file: retry linking file into place when occluding file vanishes
    @@ Metadata
      ## Commit message ##
         object-file: retry linking file into place when occluding file vanishes
     
    -    In a preceding commit, we have adapted `check_collision()` to ignore
    -    the case where either of the colliding files vanishes. This should be
    -    safe in general when we assume that the contents of these two files were
    -    the same. But the check is all about detecting collisions, so that
    -    assumption may be too optimistic.
    +    Prior to 0ad3d65652 (object-file: fix race in object collision check,
    +    2024-12-30), callers could expect that a successful return from
    +    `finalize_object_file()` means that either the file was moved into
    +    place, or the identical bytes were already present. If neither of those
    +    happens, we'd return an error.
     
    -    Adapt the code to retry linking the object into place when we see that
    -    the destination file has racily vanished. This should generally succeed
    -    as we have just observed that the destination file does not exist
    -    anymore, except in the very unlikely event that it gets recreated by
    -    another concurrent process again.
    +    Since that commit, if the destination file disappears between our
    +    link(3p) call and the collision check, we'd return success without
    +    actually checking the contents, and without retrying the link. This
    +    solves the common case that the files were indeed the same, but it means
    +    that we may corrupt the repository if they weren't (this implies a hash
    +    collision, but the whole point of this function is protecting against
    +    hash collisions).
     
    -    Furthermore, stop treating `ENOENT` specially for the source file. It
    -    shouldn't happen that the source vanishes as we're using a fresh
    -    temporary file for it, so if it does vanish it indicates an actual
    -    error.
    +    We can't be pessimistic and assume they're different; that hurts the
    +    common case that the mentioned commit was trying to fix. But after
    +    seeing that the destination file went away, we can retry linking again.
    +    Adapt the code to do so when we see that the destination file has racily
    +    vanished. This should generally succeed as we have just observed that
    +    the destination file does not exist anymore, except in the very unlikely
    +    event that it gets recreated by another concurrent process again.
     
    -    Suggested-by: Jeff King <peff@peff.net>
    +    Helped-by: Jeff King <peff@peff.net>
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## object-file.c ##
    @@ object-file.c: static void write_object_file_prepare_literally(const struct git_
      static int check_collision(const char *source, const char *dest)
      {
      	char buf_source[4096], buf_dest[4096];
    -@@ object-file.c: static int check_collision(const char *source, const char *dest)
    - 
    - 	fd_source = open(source, O_RDONLY);
    - 	if (fd_source < 0) {
    --		if (errno != ENOENT)
    --			ret = error_errno(_("unable to open %s"), source);
    -+		ret = error_errno(_("unable to open %s"), source);
    - 		goto out;
    - 	}
    - 
     @@ object-file.c: static int check_collision(const char *source, const char *dest)
      	if (fd_dest < 0) {
      		if (errno != ENOENT)
    @@ object-file.c: int finalize_object_file(const char *tmpfile, const char *filenam
      {
     -	struct stat st;
     -	int ret = 0;
    ++	unsigned retries = 0;
     +	int ret;
     +
     +retry:
    @@ object-file.c: int finalize_object_file_flags(const char *tmpfile, const char *f
     -		    check_collision(tmpfile, filename))
     +		if (!(flags & FOF_SKIP_COLLISION_CHECK)) {
     +			ret = check_collision(tmpfile, filename);
    -+			if (ret == CHECK_COLLISION_DEST_VANISHED)
    ++			if (ret == CHECK_COLLISION_DEST_VANISHED) {
    ++				if (retries++ > 5)
    ++					return error(_("unable to write repeatedly vanishing file %s"),
    ++						     filename);
     +				goto retry;
    ++			}
     +			else if (ret)
      				return -1;
     +		}

---
base-commit: 2be278337fd02495a86577a89fbf9387b2df6523
change-id: 20250103-b4-pks-object-file-racy-collision-check-a649bbf96a92