Message ID | e8257516c1f983d590b8a81fb32ecd5cb91dc737.1607223276.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | diffcore-rename improvements | expand |
On Sun, Dec 06, 2020 at 02:54:34AM +0000, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > Inexact rename detection works by comparing all sources to all > destinations, computing similarities, and then finding the best matches > among those that are sufficiently similar. > > However, it is preceded by exact rename detection that works by > checking if there are files with identical hashes. If exact renames are > found, we can exclude some files from inexact rename detection. > > The inexact rename detection loops over the full set of files, but > immediately skips those for which rename_dst[i].is_rename is true and > thus doesn't compare any sources to that destination. As such, these > paths shouldn't be included in the progress counter. > > For the eagle eyed, this change hints at an actual optimization -- the > first one I presented at Git Merge 2020. I'll be submitting that > optimization later, once the basic merge-ort algorithm has merged. ;-) I was hoping that that was the case. > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > diffcore-rename.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/diffcore-rename.c b/diffcore-rename.c > index 7270eb6af48..3d637ba4645 100644 > --- a/diffcore-rename.c > +++ b/diffcore-rename.c > @@ -587,7 +587,7 @@ void diffcore_rename(struct diff_options *options) > if (options->show_rename_progress) { > progress = start_delayed_progress( > _("Performing inexact rename detection"), > - (uint64_t)rename_dst_nr * (uint64_t)rename_src_nr); > + (uint64_t)num_targets * (uint64_t)rename_src_nr); > } > > mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx)); > @@ -626,7 +626,8 @@ void diffcore_rename(struct diff_options *options) > diff_free_filespec_blob(two); > } > dst_cnt++; > - display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr); > + display_progress(progress, > + (uint64_t)dst_cnt * (uint64_t)rename_src_nr); Both of these spots need only one cast, but the patch itself looks correct (with or without dropping a cast on one of the operands in each expression). Thanks, Taylor
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Elijah Newren <newren@gmail.com> > > Inexact rename detection works by comparing all sources to all > destinations, computing similarities, and then finding the best matches > among those that are sufficiently similar. Here, you are using <sources, destinations> as contrasting pair of words, which supports my comment on 1/7 and 3/7 ;-) > However, it is preceded by exact rename detection that works by > checking if there are files with identical hashes. If exact renames are > found, we can exclude some files from inexact rename detection. > > The inexact rename detection loops over the full set of files, but > immediately skips those for which rename_dst[i].is_rename is true and > thus doesn't compare any sources to that destination. As such, these > paths shouldn't be included in the progress counter. > ... > diff --git a/diffcore-rename.c b/diffcore-rename.c > index 7270eb6af48..3d637ba4645 100644 > --- a/diffcore-rename.c > +++ b/diffcore-rename.c > @@ -587,7 +587,7 @@ void diffcore_rename(struct diff_options *options) > if (options->show_rename_progress) { > progress = start_delayed_progress( > _("Performing inexact rename detection"), > - (uint64_t)rename_dst_nr * (uint64_t)rename_src_nr); > + (uint64_t)num_targets * (uint64_t)rename_src_nr); > } The num_targets (number of destinations) holds the "remaining" candidates after exact renames are taken care of, so this reduces the size of the matrix used to count the progress meter. OK. > mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx)); > @@ -626,7 +626,8 @@ void diffcore_rename(struct diff_options *options) > diff_free_filespec_blob(two); > } > dst_cnt++; > - display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr); > + display_progress(progress, > + (uint64_t)dst_cnt * (uint64_t)rename_src_nr); And this fills the progress meter by using the number of destinations that are actually considered. Between the two hunks, there is a "if the source of this destination is already known, move to next 'i'" continue, that does not update the progress meter. Changing (i+1) to dst_cnt here compensates for the reduction of the matrix size we see above. Makes sense. This looks good. > } > stop_progress(&progress);
diff --git a/diffcore-rename.c b/diffcore-rename.c index 7270eb6af48..3d637ba4645 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -587,7 +587,7 @@ void diffcore_rename(struct diff_options *options) if (options->show_rename_progress) { progress = start_delayed_progress( _("Performing inexact rename detection"), - (uint64_t)rename_dst_nr * (uint64_t)rename_src_nr); + (uint64_t)num_targets * (uint64_t)rename_src_nr); } mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_targets), sizeof(*mx)); @@ -626,7 +626,8 @@ void diffcore_rename(struct diff_options *options) diff_free_filespec_blob(two); } dst_cnt++; - display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr); + display_progress(progress, + (uint64_t)dst_cnt * (uint64_t)rename_src_nr); } stop_progress(&progress);