Message ID | pull.873.git.1612973499110.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 899034efca9cd17115a648045385846231e6c0dc |
Headers | show |
Series | reflog expire --stale-fix: be generous about missing objects | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > Whenever a user runs `git reflog expire --stale-fix`, the most likely > reason is that their repository is at least _somewhat_ corrupt. Which > means that it is more than just possible that some objects are missing. > > If that is the case, that can currently let the command abort through > the phase where it tries to mark all reachable objects. > > Instead of adding insult to injury, let's be gentle and continue as best > as we can in such a scenario, simply by ignoring the missing objects and > moving on. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- I am of two minds. I appreciate an effort of making it looser and less likely to get in trouble when running in a corrupt repository, but --stale-fix is a bit special and should probably be removed. The only reason the option was added was because we needed to have an easy way to recover from a specific kind of reflog corruption that would have resulted by a (then known) bug in "git reflog" in the released version of Git that came immediately before the version of Git that added the "fix" option, while the root cause of the corruption got fixed. Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was written, it was very useful to have a way to recover from the corruption likely to have happened with the version of Git that came before it. But it no longer is relevant after this many years. There may be other ways to break the reflog entries, but --stale-fix was never designed to deal with anything but a specific way the reflog gets corrupted (namely, by the old version of Git that corrupted reflog in a specific way), so keeping it would not be very useful.
On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote: > I appreciate an effort of making it looser and less likely to get in > trouble when running in a corrupt repository, but --stale-fix is a > bit special and should probably be removed. > > The only reason the option was added was because we needed to have > an easy way to recover from a specific kind of reflog corruption > that would have resulted by a (then known) bug in "git reflog" in > the released version of Git that came immediately before the version > of Git that added the "fix" option, while the root cause of the > corruption got fixed. > > Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was > written, it was very useful to have a way to recover from the > corruption likely to have happened with the version of Git that came > before it. But it no longer is relevant after this many years. > There may be other ways to break the reflog entries, but --stale-fix > was never designed to deal with anything but a specific way the > reflog gets corrupted (namely, by the old version of Git that > corrupted reflog in a specific way), so keeping it would not be very > useful. FWIW, I have used --stale-fix for cases where a repo's reflog was "corrupted" by its alternate pruning objects. E.g., if you do something like this: git clone -s orig.git new.git you're now at risk of "git gc" in orig.git corrupting new.git, because its reachability check doesn't know anything about those refs. You can mitigate that by keeping a copy of new.git's refs in orig.git. Something like: git -C orig.git fetch ../new.git refs/*:refs/preserve/* git -C orig.git gc But that doesn't know about reflogs at all! It will still corrupt them (but only sometimes; depending how often you do that fetch, you might catch the same values in orig.git's reflog). Possibly the correct answer here is "turn off reflogs in new.git", but they are sometimes useful, and things _mostly_ work (for history that doesn't rewind, or where the rewound bits are specific to new.git). So it's useful to be able to run something like "reflog expire --stale-fix" to clear out the occasional problem. (A careful reader will note that objects mentioned only in the index have a similar problem; but again, those tend to be local to new.git, and don't exist at all in a server context). -Peff
Hi, On Thu, 11 Feb 2021, Jeff King wrote: > On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote: > > > I appreciate an effort of making it looser and less likely to get in > > trouble when running in a corrupt repository, but --stale-fix is a > > bit special and should probably be removed. > > > > The only reason the option was added was because we needed to have > > an easy way to recover from a specific kind of reflog corruption > > that would have resulted by a (then known) bug in "git reflog" in > > the released version of Git that came immediately before the version > > of Git that added the "fix" option, while the root cause of the > > corruption got fixed. > > > > Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was > > written, it was very useful to have a way to recover from the > > corruption likely to have happened with the version of Git that came > > before it. But it no longer is relevant after this many years. > > There may be other ways to break the reflog entries, but --stale-fix > > was never designed to deal with anything but a specific way the > > reflog gets corrupted (namely, by the old version of Git that > > corrupted reflog in a specific way), so keeping it would not be very > > useful. Thank you for the original historical context. > FWIW, I have used --stale-fix for cases where a repo's reflog was > "corrupted" by its alternate pruning objects. > > E.g., if you do something like this: > > git clone -s orig.git new.git > > you're now at risk of "git gc" in orig.git corrupting new.git, because > its reachability check doesn't know anything about those refs. You can > mitigate that by keeping a copy of new.git's refs in orig.git. Something > like: > > git -C orig.git fetch ../new.git refs/*:refs/preserve/* > git -C orig.git gc > > But that doesn't know about reflogs at all! It will still corrupt them > (but only sometimes; depending how often you do that fetch, you might > catch the same values in orig.git's reflog). > > Possibly the correct answer here is "turn off reflogs in new.git", but > they are sometimes useful, and things _mostly_ work (for history that > doesn't rewind, or where the rewound bits are specific to new.git). So > it's useful to be able to run something like "reflog expire --stale-fix" > to clear out the occasional problem. > > (A careful reader will note that objects mentioned only in the index > have a similar problem; but again, those tend to be local to new.git, > and don't exist at all in a server context). I want to add the experience with that half year during which `git gc` with worktrees was broken, as it would happily ignore the reflogs of the worktree-specific `HEAD`s, all except the one from the worktree from which `git gc` was run. That was some fun time, and `--stale-fix` was a life saver. With this _additional_ historical context, I would deem it wise to keep `--stale-fix` around. Ciao, Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> Possibly the correct answer here is "turn off reflogs in new.git", but >> they are sometimes useful, and things _mostly_ work (for history that >> doesn't rewind, or where the rewound bits are specific to new.git). So >> it's useful to be able to run something like "reflog expire --stale-fix" >> to clear out the occasional problem. >> >> (A careful reader will note that objects mentioned only in the index >> have a similar problem; but again, those tend to be local to new.git, >> and don't exist at all in a server context). > > I want to add the experience with that half year during which `git gc` > with worktrees was broken, as it would happily ignore the reflogs of the > worktree-specific `HEAD`s, all except the one from the worktree from which > `git gc` was run. That was some fun time, and `--stale-fix` was a life > saver. The option was invented for a specific case, but if its "fix" applies for breakages caused by more recent bugs and user induced actions, I would agree that that it gives us a good reason to keep it around. I have to wonder if the explanation in the documentation for the option needs to be made less specific to the originally motivating case, though. Thanks.
diff --git a/builtin/reflog.c b/builtin/reflog.c index ca1d8079f320..09541d1c8048 100644 --- a/builtin/reflog.c +++ b/builtin/reflog.c @@ -602,6 +602,9 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix) */ if (cb.cmd.stalefix) { repo_init_revisions(the_repository, &cb.cmd.revs, prefix); + cb.cmd.revs.do_not_die_on_missing_tree = 1; + cb.cmd.revs.ignore_missing = 1; + cb.cmd.revs.ignore_missing_links = 1; if (flags & EXPIRE_REFLOGS_VERBOSE) printf(_("Marking reachable objects...")); mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL); diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh index ecccaa063453..27b9080251a9 100755 --- a/t/t1410-reflog.sh +++ b/t/t1410-reflog.sh @@ -158,6 +158,32 @@ test_expect_success 'reflog expire' ' check_fsck "dangling commit $K" ' +test_expect_success '--stale-fix handles missing objects generously' ' + git -c core.logAllRefUpdates=false fast-import --date-format=now <<-EOS && + commit refs/heads/stale-fix + mark :1 + committer Author <a@uth.or> now + data <<EOF + start stale fix + EOF + M 100644 inline file + data <<EOF + contents + EOF + commit refs/heads/stale-fix + committer Author <a@uth.or> now + data <<EOF + stale fix branch tip + EOF + from :1 + EOS + + parent_oid=$(git rev-parse stale-fix^) && + test_when_finished "recover $parent_oid" && + corrupt $parent_oid && + git reflog expire --stale-fix +' + test_expect_success 'prune and fsck' ' git prune &&