mbox series

[v2,0/9] diffcore-rename improvements

Message ID pull.929.v2.git.git.1607677728.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series diffcore-rename improvements | expand

Message

Philippe Blain via GitGitGadget Dec. 11, 2020, 9:08 a.m. UTC
This patch series includes 3 small code cleanups, 1 small bug fix (or
perhaps just a UI improvement -- it's a very minor issue either way), more
testcases, and 3 performance improvements. The first 7 patches are
relatively easy to review, while the second to last one...is a bit more
involved (but provided the biggest performance boost).

Changes since v1:

 * Inserted a patch at the beginning to rename num_create to
   num_destinations, as suggested by Junio. Also simplifies patch 2.
 * Used st_mult() to simplify the limit check
 * Dropped a patch Junio didn't care for
 * Inserted two new patches, 5 & 6, to add new testcases motivated by
   Junio's comments about register_rename_src() and correctness in the face
   of duplicate tree entries.
 * Modified patch 7 so that register_rename_src() behaves identically to
   current code, even when trees have duplicate entries
 * Added a new patch 9 which rips out the unnecessary duplicate entry checks
   for BOTH rename_src and rename_dst, and documents why they are safe to
   remove. This is also supported by the new testcases in patches 5 and 6. I
   also ran t4058 (which patches 5 and 6 add to) under valgrind afterwards
   and verified that no errors from diffcore-rename were reported (valgrind
   did report problems from cache-tree and unpack-trees, as documented in
   patch 6, but that was true before this patch series and probably has been
   that way for years or decades).
 * Miscellaneous small cleanups highlighted by Taylor and Junio

Elijah Newren (9):
  diffcore-rename: rename num_create to num_destinations
  diffcore-rename: avoid usage of global in too_many_rename_candidates()
  diffcore-rename: simplify limit check
  diffcore-rename: reduce jumpiness in progress counters
  t4058: add more tests and documentation for duplicate tree entry
    handling
  t4058: explore duplicate tree entry handling in a bit more detail
  diffcore-rename: simplify and accelerate register_rename_src()
  diffcore-rename: accelerate rename_dst setup
  diffcore-rename: remove unneccessary duplicate entry checks

 diffcore-rename.c          | 209 ++++++++++++++-----------------------
 t/t4058-diff-duplicates.sh | 114 +++++++++++++++++++-
 2 files changed, 192 insertions(+), 131 deletions(-)


base-commit: 3a0b884caba2752da0af626fb2de7d597c844e8b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-929%2Fnewren%2Fdiffcore-rename-improvements-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-929/newren/diffcore-rename-improvements-v2
Pull-Request: https://github.com/git/git/pull/929

Range-diff vs v1:

  3:  30381addc5c !  1:  428d8204894 diffcore-rename: rename num_create to num_targets
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    diffcore-rename: rename num_create to num_targets
     +    diffcore-rename: rename num_create to num_destinations
      
     -    Files added since the base commit serve as targets for rename detection.
     -    While it is true that added files can be thought of as being "created"
     -    when they are added IF they have no pairing file that they were renamed
     -    from, and it is true we start out not knowing what the pairings are, it
     -    seems a little odd to think in terms of "file creation" when we are
     -    looking for "file renames".  Rename the variable to avoid this minor
     -    point of confusion.
     +    Our main data structures are rename_src and rename_dst.  For counters of
     +    these data structures, num_sources and num_destinations seem natural;
     +    definitely more so than using num_create for the latter.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## diffcore-rename.c ##
     +@@ diffcore-rename.c: static void record_if_better(struct diff_score m[], struct diff_score *o)
     +  * 1 if we need to disable inexact rename detection;
     +  * 2 if we would be under the limit if we were given -C instead of -C -C.
     +  */
     +-static int too_many_rename_candidates(int num_create,
     ++static int too_many_rename_candidates(int num_destinations,
     + 				      struct diff_options *options)
     + {
     + 	int rename_limit = options->rename_limit;
     +@@ diffcore-rename.c: static int too_many_rename_candidates(int num_create,
     + 	 * This basically does a test for the rename matrix not
     + 	 * growing larger than a "rename_limit" square matrix, ie:
     + 	 *
     +-	 *    num_create * num_src > rename_limit * rename_limit
     ++	 *    num_destinations * num_src > rename_limit * rename_limit
     + 	 */
     + 	if (rename_limit <= 0)
     + 		rename_limit = 32767;
     +-	if ((num_create <= rename_limit || num_src <= rename_limit) &&
     +-	    ((uint64_t)num_create * (uint64_t)num_src
     ++	if ((num_destinations <= rename_limit || num_src <= rename_limit) &&
     ++	    ((uint64_t)num_destinations * (uint64_t)num_src
     + 	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
     + 		return 0;
     + 
     + 	options->needed_rename_limit =
     +-		num_src > num_create ? num_src : num_create;
     ++		num_src > num_destinations ? num_src : num_destinations;
     + 
     + 	/* Are we running under -C -C? */
     + 	if (!options->flags.find_copies_harder)
     +@@ diffcore-rename.c: static int too_many_rename_candidates(int num_create,
     + 			continue;
     + 		num_src++;
     + 	}
     +-	if ((num_create <= rename_limit || num_src <= rename_limit) &&
     +-	    ((uint64_t)num_create * (uint64_t)num_src
     ++	if ((num_destinations <= rename_limit || num_src <= rename_limit) &&
     ++	    ((uint64_t)num_destinations * (uint64_t)num_src
     + 	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
     + 		return 2;
     + 	return 1;
      @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       	struct diff_queue_struct outq;
       	struct diff_score *mx;
       	int i, j, rename_count, skip_unmodified = 0;
      -	int num_create, dst_cnt;
     -+	int num_targets, dst_cnt;
     ++	int num_destinations, dst_cnt;
       	struct progress *progress = NULL;
       
       	if (!minimum_score)
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       	 * files still remain as options for rename/copies!)
       	 */
      -	num_create = (rename_dst_nr - rename_count);
     -+	num_targets = (rename_dst_nr - rename_count);
     ++	num_destinations = (rename_dst_nr - rename_count);
       
       	/* All done? */
      -	if (!num_create)
     -+	if (!num_targets)
     ++	if (!num_destinations)
       		goto cleanup;
       
     --	switch (too_many_rename_candidates(num_create, rename_src_nr, options)) {
     -+	switch (too_many_rename_candidates(num_targets, rename_src_nr, options)) {
     +-	switch (too_many_rename_candidates(num_create, options)) {
     ++	switch (too_many_rename_candidates(num_destinations, options)) {
       	case 1:
       		goto cleanup;
       	case 2:
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       	}
       
      -	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
     -+	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx));
     ++	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_destinations),
     ++		     sizeof(*mx));
       	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
       		struct diff_filespec *two = rename_dst[i].two;
       		struct diff_score *m;
  1:  95a3fcb8be0 !  2:  fc62f4c4f89 diffcore-rename: avoid usage of global in too_many_rename_candidates()
     @@ Metadata
       ## Commit message ##
          diffcore-rename: avoid usage of global in too_many_rename_candidates()
      
     -    too_many_rename_candidates() got the number of rename targets via an
     -    argument to the function, but the number of rename sources via a global
     -    variable.  That felt rather inconsistent.  Pass in the number of rename
     -    sources as an argument as well.
     +    too_many_rename_candidates() got the number of rename destinations via
     +    an argument to the function, but the number of rename sources via a
     +    global variable.  That felt rather inconsistent.  Pass in the number of
     +    rename sources as an argument as well.
      
          While we are at it... We had a local variable, num_src, that served two
     -    purposes.  Initially it was set to this global value, but later was used
     +    purposes.  Initially it was set to the global value, but later was used
          for counting a subset of the number of sources.  Since we now have a
          function argument for the former usage, introduce a clearer variable
          name for the latter usage.
     @@ diffcore-rename.c: static void record_if_better(struct diff_score m[], struct di
        * 1 if we need to disable inexact rename detection;
        * 2 if we would be under the limit if we were given -C instead of -C -C.
        */
     --static int too_many_rename_candidates(int num_create,
     -+static int too_many_rename_candidates(int num_targets, int num_sources,
     +-static int too_many_rename_candidates(int num_destinations,
     ++static int too_many_rename_candidates(int num_destinations, int num_sources,
       				      struct diff_options *options)
       {
       	int rename_limit = options->rename_limit;
     @@ diffcore-rename.c: static void record_if_better(struct diff_score m[], struct di
       
       	options->needed_rename_limit = 0;
       
     -@@ diffcore-rename.c: static int too_many_rename_candidates(int num_create,
     +@@ diffcore-rename.c: static int too_many_rename_candidates(int num_destinations,
       	 * This basically does a test for the rename matrix not
       	 * growing larger than a "rename_limit" square matrix, ie:
       	 *
     --	 *    num_create * num_src > rename_limit * rename_limit
     -+	 *    num_targets * num_sources > rename_limit * rename_limit
     +-	 *    num_destinations * num_src > rename_limit * rename_limit
     ++	 *    num_destinations * num_sources > rename_limit * rename_limit
       	 */
       	if (rename_limit <= 0)
       		rename_limit = 32767;
     --	if ((num_create <= rename_limit || num_src <= rename_limit) &&
     --	    ((uint64_t)num_create * (uint64_t)num_src
     -+	if ((num_targets <= rename_limit || num_sources <= rename_limit) &&
     -+	    ((uint64_t)num_targets * (uint64_t)num_sources
     +-	if ((num_destinations <= rename_limit || num_src <= rename_limit) &&
     +-	    ((uint64_t)num_destinations * (uint64_t)num_src
     ++	if ((num_destinations <= rename_limit || num_sources <= rename_limit) &&
     ++	    ((uint64_t)num_destinations * (uint64_t)num_sources
       	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
       		return 0;
       
       	options->needed_rename_limit =
     --		num_src > num_create ? num_src : num_create;
     -+		num_sources > num_targets ? num_sources : num_targets;
     +-		num_src > num_destinations ? num_src : num_destinations;
     ++		num_sources > num_destinations ? num_sources : num_destinations;
       
       	/* Are we running under -C -C? */
       	if (!options->flags.find_copies_harder)
     @@ diffcore-rename.c: static int too_many_rename_candidates(int num_create,
      -		num_src++;
      +		limited_sources++;
       	}
     --	if ((num_create <= rename_limit || num_src <= rename_limit) &&
     --	    ((uint64_t)num_create * (uint64_t)num_src
     -+	if ((num_targets <= rename_limit || limited_sources <= rename_limit) &&
     -+	    ((uint64_t)num_targets * (uint64_t)limited_sources
     +-	if ((num_destinations <= rename_limit || num_src <= rename_limit) &&
     +-	    ((uint64_t)num_destinations * (uint64_t)num_src
     ++	if ((num_destinations <= rename_limit || limited_sources <= rename_limit) &&
     ++	    ((uint64_t)num_destinations * (uint64_t)limited_sources
       	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
       		return 2;
       	return 1;
      @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
     - 	if (!num_create)
     + 	if (!num_destinations)
       		goto cleanup;
       
     --	switch (too_many_rename_candidates(num_create, options)) {
     -+	switch (too_many_rename_candidates(num_create, rename_src_nr, options)) {
     +-	switch (too_many_rename_candidates(num_destinations, options)) {
     ++	switch (too_many_rename_candidates(num_destinations, rename_src_nr,
     ++					   options)) {
       	case 1:
       		goto cleanup;
       	case 2:
  2:  f96bb36930a !  3:  7214fa73fdd diffcore-rename: remove unnecessary if-clause
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    diffcore-rename: remove unnecessary if-clause
     +    diffcore-rename: simplify limit check
      
          diffcore-rename had two different checks of the form
      
              if ((a < limit || b < limit) &&
                  a * b <= limit * limit)
      
     -    Since these are all non-negative integers, this can be simplified to
     +    This can be simplified to
      
     -        if (a * b <= limit * limit)
     +        if (st_mult(a, b) <= st_mult(limit, limit))
      
     -    The only advantage of the former would be in avoiding a couple
     -    multiplications in the rare case that both a and b are BOTH very large.
     -    I see no reason for such an optimization given that this code is not in
     -    any kind of loop.  Prefer code simplicity here and change to the latter
     -    form.
     +    which makes it clearer how we are checking for overflow, and makes it
     +    much easier to parse given the drop from 8 to 4 variable appearances.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## diffcore-rename.c ##
     -@@ diffcore-rename.c: static int too_many_rename_candidates(int num_targets, int num_sources,
     +@@ diffcore-rename.c: static int too_many_rename_candidates(int num_destinations, int num_sources,
     + 	 * growing larger than a "rename_limit" square matrix, ie:
     + 	 *
     + 	 *    num_destinations * num_sources > rename_limit * rename_limit
     ++	 *
     ++	 * We use st_mult() to check overflow conditions; in the
     ++	 * exceptional circumstance that size_t isn't large enough to hold
     ++	 * the multiplication, the system won't be able to allocate enough
     ++	 * memory for the matrix anyway.
       	 */
       	if (rename_limit <= 0)
       		rename_limit = 32767;
     --	if ((num_targets <= rename_limit || num_sources <= rename_limit) &&
     --	    ((uint64_t)num_targets * (uint64_t)num_sources
     +-	if ((num_destinations <= rename_limit || num_sources <= rename_limit) &&
     +-	    ((uint64_t)num_destinations * (uint64_t)num_sources
      -	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
     -+	if ((uint64_t)num_targets * (uint64_t)num_sources
     -+	    <= (uint64_t)rename_limit * (uint64_t)rename_limit)
     ++	if (st_mult(num_destinations, num_sources)
     ++	    <= st_mult(rename_limit, rename_limit))
       		return 0;
       
       	options->needed_rename_limit =
     -@@ diffcore-rename.c: static int too_many_rename_candidates(int num_targets, int num_sources,
     +@@ diffcore-rename.c: static int too_many_rename_candidates(int num_destinations, int num_sources,
       			continue;
       		limited_sources++;
       	}
     --	if ((num_targets <= rename_limit || limited_sources <= rename_limit) &&
     --	    ((uint64_t)num_targets * (uint64_t)limited_sources
     +-	if ((num_destinations <= rename_limit || limited_sources <= rename_limit) &&
     +-	    ((uint64_t)num_destinations * (uint64_t)limited_sources
      -	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
     -+	if ((uint64_t)num_targets * (uint64_t)limited_sources
     -+	    <= (uint64_t)rename_limit * (uint64_t)rename_limit)
     ++	if (st_mult(num_destinations, limited_sources)
     ++	    <= st_mult(rename_limit, rename_limit))
       		return 2;
       	return 1;
       }
  4:  77ca4656ed4 <  -:  ----------- diffcore-rename: change a few comments to use 'add' instead of 'create'
  5:  e8257516c1f !  4:  bf3581b1ac1 diffcore-rename: reduce jumpiness in progress counters
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       		progress = start_delayed_progress(
       				_("Performing inexact rename detection"),
      -				(uint64_t)rename_dst_nr * (uint64_t)rename_src_nr);
     -+				(uint64_t)num_targets * (uint64_t)rename_src_nr);
     ++				(uint64_t)num_destinations * (uint64_t)rename_src_nr);
       	}
       
     - 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx));
     + 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_destinations),
      @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       			diff_free_filespec_blob(two);
       		}
  -:  ----------- >  5:  9a4a3159acf t4058: add more tests and documentation for duplicate tree entry handling
  -:  ----------- >  6:  8db27892c59 t4058: explore duplicate tree entry handling in a bit more detail
  6:  306a48820dd !  7:  a58639b2927 diffcore-rename: simplify and accelerate register_rename_src()
     @@ Metadata
       ## Commit message ##
          diffcore-rename: simplify and accelerate register_rename_src()
      
     -    reigster_rename_src() took pains to create an array in rename_src which
     -    was sorted by pathname of the contained diff_filepair.  However, the
     -    fact that this array was sorted was not needed anywhere, and thus
     -    represented wasted time.  Simply append to the end of the array, which
     -    in a usecase of note saved 45% of diffcore_rename() setup time for me.
     -
     -    Technically, there is one difference in the end result for the code.  IF
     -    the caller of diffcore-rename provides a set of pairs with a duplicate
     -    filename in the sources (an erroneous input condition), the old code
     -    would simply ignore the duplicate (without warning or error).  The new
     -    code will simply swallow it and thus allow multiple pairings for the
     -    "same" source file.  Checking for this error condition is expensive
     -    (basically undoing the optimization) and the only existing callers
     -    already avoid passing such an invalid input.  If someone really wants to
     -    add some error checking here on this input condition, I believe they
     -    should add a separate function which can be called before
     -    diffcore_rename(), and then other callers that don't need additional
     -    checks because of their design (such as merge-ort) can avoid the extra
     -    overhead.
     +    register_rename_src() took pains to create an array in rename_src which
     +    was sorted by pathname of the contained diff_filepair.  The sorting was
     +    entirely unnecessary since callers pass filepairs to us in sorted
     +    order.  We can simply append to the end of the rename_src array,
     +    speeding up diffcore_rename() setup time.
      
          Also, note that I dropped the return type on the function since it was
          unconditionally discarded anyway.
     @@ diffcore-rename.c: static struct diff_rename_src {
      -		}
      -		first = next+1;
      -	}
     --
     ++	/*
     ++	 * If we have multiple entries at the same path in the source tree
     ++	 * (an invalid tree, to be sure), avoid using more more than one
     ++	 * such entry in rename detection.  Once upon a time, doing so
     ++	 * caused segfaults; see commit 25d5ea410f ("[PATCH] Redo
     ++	 * rename/copy detection logic.", 2005-05-24).
     ++	 */
     ++	if (rename_src_nr > 0 &&
     ++	    !strcmp(rename_src[rename_src_nr-1].p->one->path, p->one->path))
     ++		return;
     + 
      -	/* insert to make it at "first" */
       	ALLOC_GROW(rename_src, rename_src_nr + 1, rename_src_alloc);
      +	rename_src[rename_src_nr].p = p;
  7:  3e83b51628b !  8:  85714e1583d Accelerate rename_dst setup
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    Accelerate rename_dst setup
     +    diffcore-rename: accelerate rename_dst setup
      
          register_rename_src() simply references the passed pair inside
          rename_src.  In contrast, add_rename_dst() did something entirely
     @@ Commit message
          been done in a weakly quadratic manner, when many renames are involved
          this time could add up.
      
     -    There is still a strcmp() in add_rename_dst() that I feel is
     -    unnecessary, because I disagree with a regression test in t4058.
     -    However, I'm trying to increase performance without changing behavior,
     -    so I'll leave that fight for a different day (a TODO has been left in
     -    the code about it).
     +    There is still a strcmp() in add_rename_dst() that I have left in place
     +    to make it easier to verify that the algorithm has the same results.
     +    This strcmp() is there to check for duplicate destination entries (which
     +    was the easiest way at the time to avoid segfaults in the
     +    diffcore-rename code when trees had multiple entries at a given path).
     +    The underlying double free()s are no longer an issue with the new
     +    algorithm, but that can be addressed in a subsequent commit.
      
          This patch is being submitted in a different order than its original
          development, but in a large rebase of many commits with lots of renames
     @@ Commit message
      
       ## diffcore-rename.c ##
      @@
     - #include "cache.h"
     - #include "diff.h"
     - #include "diffcore.h"
     --#include "object-store.h"
       #include "hashmap.h"
     -+#include "object-store.h"
       #include "progress.h"
       #include "promisor-remote.h"
      +#include "strmap.h"
     @@ diffcore-rename.c
      -
      -	if (first >= 0)
      +	/*
     -+	 * See t4058; trees might have duplicate entries.  I think
     -+	 * trees with duplicate entries should be ignored and we
     -+	 * should just leave rename detection on; while the results
     -+	 * may be slightly harder to understand, that's merely a
     -+	 * result of the underlying tree making no sense.  But I
     -+	 * believe everything still works fine, the results do still
     -+	 * make sense, and the extra overhead of doing this checking
     -+	 * for a few historical weird trees from long ago seems like
     -+	 * the dog wagging the tail to me.
     -+	 *
     -+	 * However: I don't feel like fighting that battle right now.
     -+	 * For now, to keep the regression test passing, we have to
     -+	 * detect it.  Since the diff machinery passes these to us in
     -+	 * adjacent pairs, we just need to check to see if our name
     -+	 * matches the previous one.
     -+	 *
     -+	 * TODO: Dispense with this test, rip out the test in t4058, and
     -+	 * lobby folks for the change.
     ++	 * If we have multiple entries at the same path in the destination
     ++	 * tree (an invalid tree, to be sure), turn off rename detection
     ++	 * entirely.  Once upon a time, diffcore-rename had double free()
     ++	 * issues due to such duplicate paths, resulting in segfaults.  See
     ++	 * commit 4d6be03b95 ("diffcore-rename: avoid processing duplicate
     ++	 * destinations", 2015-02-26).
      +	 */
      +	if (rename_dst_nr > 0 &&
      +	    !strcmp(rename_dst[rename_dst_nr-1].p->two->path, p->two->path))
     @@ diffcore-rename.c
       	return 0;
       }
       
     -@@ diffcore-rename.c: static int rename_src_nr, rename_src_alloc;
     +@@ diffcore-rename.c: static void register_rename_src(struct diff_filepair *p)
     + 	    !strcmp(rename_src[rename_src_nr-1].p->one->path, p->one->path))
     + 		return;
       
     - static void register_rename_src(struct diff_filepair *p)
     - {
      +	if (p->broken_pair) {
      +		if (!break_idx) {
      +			break_idx = xmalloc(sizeof(*break_idx));
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       					" duplicate destination '%s'",
       					p->two->path);
      @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
     - 
     - 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx));
     + 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_destinations),
     + 		     sizeof(*mx));
       	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
      -		struct diff_filespec *two = rename_dst[i].two;
      +		struct diff_filespec *two = rename_dst[i].p->two;
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
       		}
       		else if (!DIFF_FILE_VALID(p->one) && DIFF_FILE_VALID(p->two)) {
      -			/*
     --			 * Addition
     +-			 * Creation
      -			 *
     --			 * We would output this add record if it has
     +-			 * We would output this create record if it has
      -			 * not been turned into a rename/copy already.
      -			 */
      -			struct diff_rename_dst *dst = locate_rename_dst(p->two);
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
      -			}
      -			else
      -				/* no matching rename/copy source, so
     --				 * record this as an addition.
     +-				 * record this as a creation.
      -				 */
      -				diff_q(&outq, p);
     -+			/* Addition */
     ++			/* Creation */
      +			diff_q(&outq, p);
       		}
       		else if (DIFF_FILE_VALID(p->one) && !DIFF_FILE_VALID(p->two)) {
     @@ diffcore-rename.c: void diffcore_rename(struct diff_options *options)
      -				struct diff_rename_dst *dst = locate_rename_dst(p->one);
      -				if (dst && dst->pair)
      +				struct diff_rename_dst *dst = locate_rename_dst(p);
     -+				assert(dst);
     ++				if (!dst)
     ++					BUG("tracking failed somehow; failed to find associated dst for broken pair");
      +				if (dst->is_rename)
       					/* counterpart is now rename/copy */
       					pair_to_free = p;
  -:  ----------- >  9:  af256b1c534 diffcore-rename: remove unneccessary duplicate entry checks