mbox series

[v3,0/3] Performance improvements for repacking non-promisor objects

Message ID cover.1733262661.git.jonathantanmy@google.com (mailing list archive)
Headers show
Series Performance improvements for repacking non-promisor objects | expand

Message

Jonathan Tan Dec. 3, 2024, 9:52 p.m. UTC
Apparently I did not save in my text editor (and didn't notice because
the code comment was still valid syntactically, so everything still
compiled). Here's a version with the updated and correctly formatted
code comment.

Jonathan Tan (3):
  index-pack --promisor: dedup before checking links
  index-pack --promisor: don't check blobs
  index-pack --promisor: also check commits' trees

 builtin/index-pack.c | 103 +++++++++++++++++++++++++++++--------------
 1 file changed, 71 insertions(+), 32 deletions(-)

Range-diff against v2:
1:  7ae21c921f = 1:  7ae21c921f index-pack --promisor: dedup before checking links
2:  5a63c9a5ca ! 2:  a1d2a20203 index-pack --promisor: don't check blobs
    @@ builtin/index-pack.c: static void record_outgoing_link(const struct object_id *o
     +static void maybe_record_name_entry(const struct name_entry *entry)
     +{
     +	/*
    -+	 * The benefit of doing this is as above (fetch speedup), but the drawback
    -+is that if the packfile to be indexed references a local blob directly
    -+(that is, not through a local tree), that local blob is in danger of
    -+being garbage collected. Such a situation may arise if we push local
    -+commits, including one with a change to a blob in the root tree,
    -+and then the server incorporates them into its main branch through a
    -+"rebase" or "squash" merge strategy, and then we fetch the new main
    -+branch from the server.
    -+
    -+This situation has not been observed yet - we have only noticed missing
    -+commits, not missing trees or blobs. (In fact, if it were believed that
    -+only missing commits are problematic, one could argue that we should
    -+also exclude trees during the outgoing link check; but it is safer to
    -+include them.)
    -+
    -+Due to the rarity of the situation (it has not been observed to happen
    -+in real life), and because the "penalty" in such a situation is merely
    -+to refetch the missing blob when it's needed, the tradeoff seems
    -+worth it.
    ++	 * Checking only trees here results in a significantly faster packfile
    ++	 * indexing, but the drawback is that if the packfile to be indexed
    ++	 * references a local blob only directly (that is, never through a
    ++	 * local tree), that local blob is in danger of being garbage
    ++	 * collected. Such a situation may arise if we push local commits,
    ++	 * including one with a change to a blob in the root tree, and then the
    ++	 * server incorporates them into its main branch through a "rebase" or
    ++	 * "squash" merge strategy, and then we fetch the new main branch from
    ++	 * the server.
    ++	 *
    ++	 * This situation has not been observed yet - we have only noticed
    ++	 * missing commits, not missing trees or blobs. (In fact, if it were
    ++	 * believed that only missing commits are problematic, one could argue
    ++	 * that we should also exclude trees during the outgoing link check;
    ++	 * but it is safer to include them.)
    ++	 *
    ++	 * Due to the rarity of the situation (it has not been observed to
    ++	 * happen in real life), and because the "penalty" in such a situation
    ++	 * is merely to refetch the missing blob when it's needed (and this
    ++	 * happens only once - when refetched, the blob goes into a promisor
    ++	 * pack, so it won't be GC-ed, the tradeoff seems worth it.
     +	*/
     +	if (S_ISDIR(entry->mode))
     +		record_outgoing_link(&entry->oid);
3:  8139325bf2 = 3:  f9f9969a8f index-pack --promisor: also check commits' trees

Comments

Junio C Hamano Dec. 4, 2024, 2:22 a.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> Apparently I did not save in my text editor (and didn't notice because
> the code comment was still valid syntactically, so everything still
> compiled). Here's a version with the updated and correctly formatted
> code comment.

Thanks.  Will replace.