mbox series

[0/2] When fetching, warn if in commit graph but not obj db

Message ID cover.1730235646.git.jonathantanmy@google.com (mailing list archive)
Headers show
Series When fetching, warn if in commit graph but not obj db | expand

Message

Jonathan Tan Oct. 29, 2024, 9:11 p.m. UTC
I mentioned previously [1] the possibility of not running maintenance
steps (commit graph writing and "git maintenance") if no packs were
fetched, but looking at things again, I think that we shouldn't do
that - in particular, if I ran "git fetch --refetch", I would fully
expect the objects to be repacked, even if Git wasn't able to detect
conclusively whether a pack was transmitted.

So I went back to my original idea of detecting when an object is
missing. In trying to balance the concerns of both doing something as
reasonable as possible in such a repo corruption case, and not slowing
down and/or unnecessarily complicating the main code flow, I decided
to detect when an object is present in the commit graph but not in the
object DB, and to limit this detection for objects specified in the
fetch refspec.

Upon detection, we can't fix it due to reasons mentioned in the commit
message, so I decided to print a warning. An alternate option is to make
it a fatal error (instead of a warning) if an object is detected to be
in the commit graph but not the object DB. I haven't thought through the
ramifications of that, though.

[1] https://lore.kernel.org/git/20241028225504.4151804-1-jonathantanmy@google.com/

Jonathan Tan (2):
  Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
  fetch-pack: warn if in commit graph but not obj db

 fetch-pack.c | 45 +++++++++++++++++++++++++--------------------
 object.h     |  2 +-
 2 files changed, 26 insertions(+), 21 deletions(-)

Comments

Josh Steadmon Oct. 30, 2024, 9:22 p.m. UTC | #1
On 2024.10.29 14:11, Jonathan Tan wrote:
> I mentioned previously [1] the possibility of not running maintenance
> steps (commit graph writing and "git maintenance") if no packs were
> fetched, but looking at things again, I think that we shouldn't do
> that - in particular, if I ran "git fetch --refetch", I would fully
> expect the objects to be repacked, even if Git wasn't able to detect
> conclusively whether a pack was transmitted.
> 
> [1] https://lore.kernel.org/git/20241028225504.4151804-1-jonathantanmy@google.com/

A note for upstream, because I'm not sure it was ever explicitly
mentioned: at $DAYJOB, we saw this fetch recursion error as a
side-effect of the erroneous GC of local commits discussed at [2].

[2] https://lore.kernel.org/git/cover.1729792911.git.jonathantanmy@google.com/

> So I went back to my original idea of detecting when an object is
> missing. In trying to balance the concerns of both doing something as
> reasonable as possible in such a repo corruption case, and not slowing
> down and/or unnecessarily complicating the main code flow, I decided
> to detect when an object is present in the commit graph but not in the
> object DB, and to limit this detection for objects specified in the
> fetch refspec.
> 
> Upon detection, we can't fix it due to reasons mentioned in the commit
> message, so I decided to print a warning. An alternate option is to make
> it a fatal error (instead of a warning) if an object is detected to be
> in the commit graph but not the object DB. I haven't thought through the
> ramifications of that, though.

At first glance, I lean towards making this a fatal error, but I'll try
thinking out loud a bit:

First, we believe that [2] above should fix the root cause of the
particular case we saw at $DAYJOB (hopefully this type of error doesn't
have multiple root causes). So we expect to basically never encounter
this error again after [2] is merged and rolled out, and all existing
cases of repo corruption have been repaired. However, interacting with a
broken repo even with a client that includes [2] would still hit this
condition and issue a warning.

With the current implementation, fetching in a corrupt repo would still
cause git-fetch to infinitely recurse, and therefore would repeatedly
print the same error message to the console, until either the user
noticed, or we fail to launch a new git-fetch process due to resource
exhaustion.

I don't see any reason why the above situation is more friendly or
desirable than exiting (with the same error message) as soon as we
detect this type of corruption. However, I don't feel super strongly
about it. If the rest of the list is OK with repeated error messages,
then I can live with it.

> Jonathan Tan (2):
>   Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
>   fetch-pack: warn if in commit graph but not obj db
> 
>  fetch-pack.c | 45 +++++++++++++++++++++++++--------------------
>  object.h     |  2 +-
>  2 files changed, 26 insertions(+), 21 deletions(-)
> 
> -- 
> 2.47.0.163.g1226f6d8fa-goog
> 
>