Message ID | cover.1730235646.git.jonathantanmy@google.com (mailing list archive) |
---|---|
Headers | show |
Series | When fetching, warn if in commit graph but not obj db | expand |
On 2024.10.29 14:11, Jonathan Tan wrote: > I mentioned previously [1] the possibility of not running maintenance > steps (commit graph writing and "git maintenance") if no packs were > fetched, but looking at things again, I think that we shouldn't do > that - in particular, if I ran "git fetch --refetch", I would fully > expect the objects to be repacked, even if Git wasn't able to detect > conclusively whether a pack was transmitted. > > [1] https://lore.kernel.org/git/20241028225504.4151804-1-jonathantanmy@google.com/ A note for upstream, because I'm not sure it was ever explicitly mentioned: at $DAYJOB, we saw this fetch recursion error as a side-effect of the erroneous GC of local commits discussed at [2]. [2] https://lore.kernel.org/git/cover.1729792911.git.jonathantanmy@google.com/ > So I went back to my original idea of detecting when an object is > missing. In trying to balance the concerns of both doing something as > reasonable as possible in such a repo corruption case, and not slowing > down and/or unnecessarily complicating the main code flow, I decided > to detect when an object is present in the commit graph but not in the > object DB, and to limit this detection for objects specified in the > fetch refspec. > > Upon detection, we can't fix it due to reasons mentioned in the commit > message, so I decided to print a warning. An alternate option is to make > it a fatal error (instead of a warning) if an object is detected to be > in the commit graph but not the object DB. I haven't thought through the > ramifications of that, though. At first glance, I lean towards making this a fatal error, but I'll try thinking out loud a bit: First, we believe that [2] above should fix the root cause of the particular case we saw at $DAYJOB (hopefully this type of error doesn't have multiple root causes). So we expect to basically never encounter this error again after [2] is merged and rolled out, and all existing cases of repo corruption have been repaired. However, interacting with a broken repo even with a client that includes [2] would still hit this condition and issue a warning. With the current implementation, fetching in a corrupt repo would still cause git-fetch to infinitely recurse, and therefore would repeatedly print the same error message to the console, until either the user noticed, or we fail to launch a new git-fetch process due to resource exhaustion. I don't see any reason why the above situation is more friendly or desirable than exiting (with the same error message) as soon as we detect this type of corruption. However, I don't feel super strongly about it. If the rest of the list is OK with repeated error messages, then I can live with it. > Jonathan Tan (2): > Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" > fetch-pack: warn if in commit graph but not obj db > > fetch-pack.c | 45 +++++++++++++++++++++++++-------------------- > object.h | 2 +- > 2 files changed, 26 insertions(+), 21 deletions(-) > > -- > 2.47.0.163.g1226f6d8fa-goog > >