[v3,0/3] grep: integrate with sparse index

Message ID: 20220901045736.523371-1-shaoxuan.yuan02@gmail.com

Shaoxuan Yuan Sept. 1, 2022, 4:57 a.m. UTC
Integrate `git grep` with the sparse index and measure the performance
improvement.

Changes since v2
----------------

* Modify the commit message for "builtin/grep.c: integrate with sparse
  index" to make it obvious that the perf test results are not from
  p2000 tests, but from manual perf runs.

* Add tree-walking logic as an extra (third) patch to improve
  performance when --sparse is used. This resolves the leftover bit
  from v2 [1].

[1] https://lore.kernel.org/git/20220829232843.183711-1-shaoxuan.yuan02@gmail.com/
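
For readers who want to repeat the manual perf runs mentioned above, here
is a rough sketch of how the "do_read_index" region can be captured with
GIT_TRACE2_PERF=1. The throwaway repository, the file name and the shell
variable names are illustrative assumptions, not part of the series. A real
comparison would run the same command in a sparse-checkout repository with
the sparse index enabled, once with a build of HEAD~ and once with HEAD.

```shell
#!/bin/sh
# Sketch: capture the "do_read_index" trace2 region for `git grep --cached`.
# The temporary repository and file name are hypothetical placeholders.
repo=$(mktemp -d)
git init -q "$repo"
echo "hello" >"$repo/file.txt"
git -C "$repo" add file.txt

# GIT_TRACE2_PERF=1 sends the perf-format trace to stderr.
# `git grep` exits non-zero when nothing matches, so ignore its status.
trace=$(GIT_TRACE2_PERF=1 git -C "$repo" grep --cached bogus 2>&1 || true)

# Keep only the region events emitted around the index read.
regions=$(printf '%s\n' "$trace" | grep 'do_read_index' || true)
printf '%s\n' "$regions"

rm -rf "$repo"
```

The region_leave line for "do_read_index" carries the elapsed time of the
index read, which is the number to compare between the two builds.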

Changes since v1
----------------

* Rewrite the commit message for "builtin/grep.c: add --sparse option"
  to be clearer.

* Update the documentation (both in-code and man page) for --sparse.

* Add a few tests covering the new behavior (when _only_ --cached is
  supplied).

* Reformat the perf test results so they do not look like they came
  directly from p2000 tests.

* Put the "command_requires_full_index" lines right after parse_options().

* Add a pathspec test in t1092, and reword some test documentation.

Shaoxuan Yuan (3):
  builtin/grep.c: add --sparse option
  builtin/grep.c: integrate with sparse index
  builtin/grep.c: walking tree instead of expanding index with --sparse

 Documentation/git-grep.txt               |  5 ++-
 builtin/grep.c                           | 46 +++++++++++++++++++++---
 t/perf/p2000-sparse-operations.sh        |  1 +
 t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++
 t/t7817-grep-sparse-checkout.sh          | 34 ++++++++++++++----
 5 files changed, 92 insertions(+), 12 deletions(-)

Range-diff against v2:
1:  ab5ff488a1 = 1:  db1f5a5409 builtin/grep.c: add --sparse option
2:  68c7ecee73 ! 2:  af566c7862 builtin/grep.c: integrate with sparse index
    @@ Commit message
     
         Turn on sparse index and remove ensure_full_index().
     
    -    Change it to only expands the index when using --sparse.
    +    Change it to only expand the index when using --sparse.
     
    -    The p2000 tests demonstrate a ~99.4% execution time reduction for
    +    The p2000 tests do not demonstrate a significant improvement,
    +    because the index read is a small portion of the full process
    +    time, compared to the blob parsing. The times below reflect the
    +    time spent in the "do_read_index" trace region as shown using
    +    GIT_TRACE2_PERF=1.
    +
    +    The tests demonstrate a ~99.4% execution time reduction for
         `git grep` using a sparse index.
     
    -    Test                                  Before       After
    +    Test                                  HEAD~        HEAD
         -----------------------------------------------------------------------------
         git grep --cached bogus (full-v3)     0.019        0.018  (-5.2%)
         git grep --cached bogus (full-v4)     0.017        0.016  (-5.8%)
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
      		int fallback = 0;
      		git_config_get_bool("grep.fallbacktonoindex", &fallback);
     
    - ## t/perf/p2000-sparse-operations.sh ##
    -@@ t/perf/p2000-sparse-operations.sh: test_perf_on_all git read-tree -mu HEAD
    - test_perf_on_all git checkout-index -f --all
    - test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
    - test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
    -+test_perf_on_all git grep --cached bogus
    - 
    - test_done
    -
      ## t/t1092-sparse-checkout-compatibility.sh ##
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: rm' '
      	ensure_not_expanded rm -r deep
-:  ---------- > 3:  757ac7ddee builtin/grep.c: walking tree instead of expanding index with --sparse

base-commit: d42b38dfb5edf1a7fddd9542d722f91038407819