Message ID | 0131d21f-dabd-3da5-34bd-a570e990f9e0@web.de (mailing list archive) |
---|---|
State | New, archived |
Series | add: don't write objects with --dry-run |
René Scharfe <l.s.r@web.de> writes:

> When the option --dry-run/-n is given, "git add" doesn't change the
> index, but still writes out new object files.  Only hash the latter
> without writing instead to make the run as dry as possible.
>
> Use this opportunity to also make the hash_flags variable unsigned,
> to match the index_path() parameter it is used as.
>
> Reported-by: git.mexon@spamgourmet.com
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Am I missing something?  Do we sometimes rely on the written objects
> within the "git add --dry-run" command?

Good question.  I do not think of anything offhand, but this obvious
"omission" makes me suspect that we may be forgetting something.

Thanks.

>  read-cache.c          | 2 +-
>  t/t2200-add-update.sh | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index a78b88a41b..7fcc948077 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -738,7 +738,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
>  	int intent_only = flags & ADD_CACHE_INTENT;
>  	int add_option = (ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE|
>  			  (intent_only ? ADD_CACHE_NEW_ONLY : 0));
> -	int hash_flags = HASH_WRITE_OBJECT;
> +	unsigned hash_flags = pretend ? 0 : HASH_WRITE_OBJECT;
>  	struct object_id oid;
>
>  	if (flags & ADD_CACHE_RENORMALIZE)
> diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
> index 45ca35d60a..94c4cb0672 100755
> --- a/t/t2200-add-update.sh
> +++ b/t/t2200-add-update.sh
> @@ -129,12 +129,15 @@ test_expect_success 'add -n -u should not add but just report' '
>  		echo "remove '\''top'\''"
>  	) >expect &&
>  	before=$(git ls-files -s check top) &&
> +	git count-objects -v >objects_before &&
>  	echo changed >>check &&
>  	rm -f top &&
>  	git add -n -u >actual &&
>  	after=$(git ls-files -s check top) &&
> +	git count-objects -v >objects_after &&
>
>  	test "$before" = "$after" &&
> +	test_cmp objects_before objects_after &&
>  	test_cmp expect actual
>
>  '
> --
> 2.33.0
On Tue, Oct 12 2021, René Scharfe wrote:

> When the option --dry-run/-n is given, "git add" doesn't change the
> index, but still writes out new object files.  Only hash the latter
> without writing instead to make the run as dry as possible.
>
> Use this opportunity to also make the hash_flags variable unsigned,
> to match the index_path() parameter it is used as.
>
> Reported-by: git.mexon@spamgourmet.com
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Am I missing something?  Do we sometimes rely on the written objects
> within the "git add --dry-run" command?

Probably not, here's a semi-related patch of mine that never got
integrated. E.g. you'll probably find that even if you're not writing
objects we're still doing things like zlib compression here too (or
not, I haven't looked):
https://lore.kernel.org/git/20190520222932.22843-1-avarab@gmail.com/

I think the "git fetch --dry-run" command behaves like this too,
i.e. doesn't update refs, but fetches and writes objects.

For the patch I hacked up I think it's easy to argue that it shouldn't
do compression etc.

For this sort of thing and "fetch" I'm not so sure. Do we really know
that there aren't people who rely on this for say the performance of
seeing what an operation would do, and then not pay as much for the
"real one" that updates the index/refs/etc. later? Is that subsequent
"fetch" cheaper because of the --dry-run?

Maybe not, but it seems like something to look into.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I think the "git fetch --dry-run" command behaves like this too,
> i.e. doesn't update refs, but fetches and writes objects.
>
> For the patch I hacked up I think it's easy to argue that it shouldn't
> do compression etc.
>
> For this sort of thing and "fetch" I'm not so sure. Do we really know
> that there aren't people who rely on this for say the performance of
> seeing what an operation would do, and then not pay as much for the
> "real one" that updates the index/refs/etc. later? Is that subsequent
> "fetch" cheaper because of the --dry-run?

The answer to the last one is an easy "yes".

Trying to gauge the time it would take for a real fetch with
"--dry-run" is a losing battle, I would think, as the pre-fetching
would make the "real" one cheaper, so from that point of view, I
think we can ignore those who time "--dry-run" and try to figure out
anything meaningful.

This in any case is an interesting area, as the definition of
correctness of what "dry-run" does can be quite fuzzy.  As long as
it does not change the index, "git add --dry-run", even if it writes
objects or detects filesystem corruption by noticing I/O error while
compressing the data taken from the working tree files, is still
correct and the patch in question is not technically a bugfix (it is
a performance thing).

"git fetch --dry-run" would fall into the same category, so would
"git hash-object" without "-w".  All can use performance enhancement
without breaking existing users, I would think.

Thanks.
diff --git a/read-cache.c b/read-cache.c
index a78b88a41b..7fcc948077 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -738,7 +738,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	int intent_only = flags & ADD_CACHE_INTENT;
 	int add_option = (ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE|
 			  (intent_only ? ADD_CACHE_NEW_ONLY : 0));
-	int hash_flags = HASH_WRITE_OBJECT;
+	unsigned hash_flags = pretend ? 0 : HASH_WRITE_OBJECT;
 	struct object_id oid;
 
 	if (flags & ADD_CACHE_RENORMALIZE)
diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
index 45ca35d60a..94c4cb0672 100755
--- a/t/t2200-add-update.sh
+++ b/t/t2200-add-update.sh
@@ -129,12 +129,15 @@ test_expect_success 'add -n -u should not add but just report' '
 		echo "remove '\''top'\''"
 	) >expect &&
 	before=$(git ls-files -s check top) &&
+	git count-objects -v >objects_before &&
 	echo changed >>check &&
 	rm -f top &&
 	git add -n -u >actual &&
 	after=$(git ls-files -s check top) &&
+	git count-objects -v >objects_after &&
 
 	test "$before" = "$after" &&
+	test_cmp objects_before objects_after &&
 	test_cmp expect actual
 
 '
When the option --dry-run/-n is given, "git add" doesn't change the
index, but still writes out new object files.  Only hash the latter
without writing instead to make the run as dry as possible.

Use this opportunity to also make the hash_flags variable unsigned,
to match the index_path() parameter it is used as.

Reported-by: git.mexon@spamgourmet.com
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Am I missing something?  Do we sometimes rely on the written objects
within the "git add --dry-run" command?

 read-cache.c          | 2 +-
 t/t2200-add-update.sh | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

--
2.33.0
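
As a rough illustration of why "only hash without writing" is possible at all: a blob's object ID depends only on its content, so it can be computed without touching the object database. The sketch below is not code from the patch. `blob_oid` is a hypothetical helper, and it assumes a SHA-1 repository, no content filters (CRLF conversion, clean filters), and that coreutils `sha1sum` is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print the object ID git would
# assign to the file "$1" as a blob, without creating any object file.
# Assumptions: SHA-1 object format, no content filters, coreutils sha1sum.
blob_oid () {
	# A blob is hashed as the header "blob <size>" plus a NUL byte,
	# followed by the raw file content.
	size=$(wc -c <"$1" | tr -d ' ') &&
	{ printf 'blob %s\000' "$size" && cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Usage: hash a scratch file; nothing is written besides the scratch file.
tmp=$(mktemp) &&
printf 'hello\n' >"$tmp" &&
blob_oid "$tmp" &&
rm -f "$tmp"
# The blob_oid call prints ce013625030ba8dba906f756967f9e9ca394464a
```

For unfiltered content this should agree with what "git hash-object <file>" (without "-w") reports, which is the same hash-only behavior the thread compares the patched "git add --dry-run" to.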