Message ID | a3bd253fa8e8ae47d19beb35327d8283ffa49289.1620432500.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Directory traversal fixes | expand |
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Elijah Newren <newren@gmail.com> > > PNPM is apparently creating deeply nested (but ignored) directory Sorry, but what's PNPM? > structures; traversing them is costly performance-wise, unnecessary, and > in some cases is even throwing warnings/errors because the paths are too > long to handle on various platforms. Add a testcase that demonstrates > this problem. > > Initial-test-by: Jason Gore <Jason.Gore@microsoft.com> > Helped-by: brian m. carlson <sandals@crustytoothpaste.net> > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > t/t7300-clean.sh | 40 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 40 insertions(+) > > diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh > index a74816ca8b46..5f1dc397c11e 100755 > --- a/t/t7300-clean.sh > +++ b/t/t7300-clean.sh > @@ -746,4 +746,44 @@ test_expect_success 'clean untracked paths by pathspec' ' > test_must_be_empty actual > ' > > +test_expect_failure 'avoid traversing into ignored directories' ' > + test_when_finished rm -f output error && > + test_create_repo avoid-traversing-deep-hierarchy && > + ( > + cd avoid-traversing-deep-hierarchy && > + > + >directory-random-file.txt && > + # Put this file under directory400/directory399/.../directory1/ > + depth=400 && > + for x in $(test_seq 1 $depth); do Style. Lose semicolon, have "do" on the next line on its own, aligned with "for". Tip: you shouldn't need any semicolon other than the doubled ones in case/esac in your shell script. > + mkdir "tmpdirectory$x" && > + mv directory* "tmpdirectory$x" && > + mv "tmpdirectory$x" "directory$x" > + done && > + > + git clean -ffdxn -e directory$depth >../output 2>../error && > + > + test_must_be_empty ../output && > + # We especially do not want things like > + # "warning: could not open directory " > + # appearing in the error output. It is true that directories > + # that are too long cannot be opened, but we should not be > + # recursing into those directories anyway since the very first > + # level is ignored. > + test_must_be_empty ../error && > + > + # alpine-linux-musl fails to "rm -rf" a directory with such > + # a deeply nested hierarchy. Help it out by deleting the > + # leading directories ourselves. Super slow, but, what else > + # can we do? Without this, we will hit a > + # error: Tests passed but test cleanup failed; aborting > + # so do this ugly manual cleanup... > + while test ! -f directory-random-file.txt; do Ditto. > + name=$(ls -d directory*) && > + mv $name/* . && > + rmdir $name > + done Hmph, after seeing the discussion thread of v1, I was expecting to see a helper in Perl that cd's down and then comes back up while removing what is in its directory (and I expected something similar for creation side we saw above). > + ) > +' > + > test_done
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > + # alpine-linux-musl fails to "rm -rf" a directory with such > + # a deeply nested hierarchy. Help it out by deleting the > + # leading directories ourselves. Super slow, but, what else > + # can we do? Without this, we will hit a > + # error: Tests passed but test cleanup failed; aborting > + # so do this ugly manual cleanup... > + while test ! -f directory-random-file.txt; do > + name=$(ls -d directory*) && > + mv $name/* . && > + rmdir $name > + done Another thing: this not being a test_when_finished handler means it would not help after a test failure. Perhaps wrap it in a helper clean_deep_hierarchy () { rm -fr directory* || while test ! -f directory-random-file.txt do ... done } and call it from test_when_finished?
On Sat, May 8, 2021 at 3:13 AM Junio C Hamano <gitster@pobox.com> wrote: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > From: Elijah Newren <newren@gmail.com> > > > > PNPM is apparently creating deeply nested (but ignored) directory > > Sorry, but what's PNPM? a package manager; I'll use Philip Oakley's suggestion to make it more clear. > > structures; traversing them is costly performance-wise, unnecessary, and > > in some cases is even throwing warnings/errors because the paths are too > > long to handle on various platforms. Add a testcase that demonstrates > > this problem. > > > > Initial-test-by: Jason Gore <Jason.Gore@microsoft.com> > > Helped-by: brian m. carlson <sandals@crustytoothpaste.net> > > Signed-off-by: Elijah Newren <newren@gmail.com> > > --- > > t/t7300-clean.sh | 40 ++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 40 insertions(+) > > > > diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh > > index a74816ca8b46..5f1dc397c11e 100755 > > --- a/t/t7300-clean.sh > > +++ b/t/t7300-clean.sh > > @@ -746,4 +746,44 @@ test_expect_success 'clean untracked paths by pathspec' ' > > test_must_be_empty actual > > ' > > > > +test_expect_failure 'avoid traversing into ignored directories' ' > > + test_when_finished rm -f output error && > > + test_create_repo avoid-traversing-deep-hierarchy && > > + ( > > + cd avoid-traversing-deep-hierarchy && > > + > > + >directory-random-file.txt && > > + # Put this file under directory400/directory399/.../directory1/ > > + depth=400 && > > + for x in $(test_seq 1 $depth); do > > Style. Lose semicolon, have "do" on the next line on its own, > aligned with "for". Tip: you shouldn't need any semicolon other > than the doubled ones in case/esac in your shell script. Thanks. > > > + mkdir "tmpdirectory$x" && > > + mv directory* "tmpdirectory$x" && > > + mv "tmpdirectory$x" "directory$x" > > + done && > > + > > + git clean -ffdxn -e directory$depth >../output 2>../error && > > + > > + test_must_be_empty ../output && > > + # We especially do not want things like > > + # "warning: could not open directory " > > + # appearing in the error output. It is true that directories > > + # that are too long cannot be opened, but we should not be > > + # recursing into those directories anyway since the very first > > + # level is ignored. > > + test_must_be_empty ../error && > > + > > + # alpine-linux-musl fails to "rm -rf" a directory with such > > + # a deeply nested hierarchy. Help it out by deleting the > > + # leading directories ourselves. Super slow, but, what else > > + # can we do? Without this, we will hit a > > + # error: Tests passed but test cleanup failed; aborting > > + # so do this ugly manual cleanup... > > + while test ! -f directory-random-file.txt; do > > Ditto. Yep, sorry. > > + name=$(ls -d directory*) && > > + mv $name/* . && > > + rmdir $name > > + done > > Hmph, after seeing the discussion thread of v1, I was expecting to > see a helper in Perl that cd's down and then comes back up while > removing what is in its directory (and I expected something similar > for creation side we saw above). Hmm, I was a bit unsure of the alternative route I took in patches 7 and 8 (switching trace1 to trace2 in dir.c, then using it to get more statistics which would allow a much more shallow directory structure for this test). I wasn't sure if the strategy seemed acceptable, and I wanted people to be able to see the two schemes side-by-side, but if that alternative is acceptable, I want to move patch 7 to the front of the series, the code change parts of patch 8 as the second patch, and then squash the rest of patch 8 into this patch vastly simplifying this testcase and obsoleting everyone's comments on it. Maybe I should have just refactored the series that way anyway. I'll send a reroll that does that, and put all the [RFC] patches first.
On Sat, May 8, 2021 at 3:19 AM Junio C Hamano <gitster@pobox.com> wrote: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > + # alpine-linux-musl fails to "rm -rf" a directory with such > > + # a deeply nested hierarchy. Help it out by deleting the > > + # leading directories ourselves. Super slow, but, what else > > + # can we do? Without this, we will hit a > > + # error: Tests passed but test cleanup failed; aborting > > + # so do this ugly manual cleanup... > > + while test ! -f directory-random-file.txt; do > > + name=$(ls -d directory*) && > > + mv $name/* . && > > + rmdir $name > > + done > > Another thing: this not being a test_when_finished handler means it > would not help after a test failure. test failures are irrelevant here; this code is here to help test_done's directory cleanup, which only fires when all tests pass. But if I restructure the series, this whole section of code disappears. I'll do that...
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index a74816ca8b46..5f1dc397c11e 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -746,4 +746,44 @@ test_expect_success 'clean untracked paths by pathspec' ' test_must_be_empty actual ' +test_expect_failure 'avoid traversing into ignored directories' ' + test_when_finished rm -f output error && + test_create_repo avoid-traversing-deep-hierarchy && + ( + cd avoid-traversing-deep-hierarchy && + + >directory-random-file.txt && + # Put this file under directory400/directory399/.../directory1/ + depth=400 && + for x in $(test_seq 1 $depth); do + mkdir "tmpdirectory$x" && + mv directory* "tmpdirectory$x" && + mv "tmpdirectory$x" "directory$x" + done && + + git clean -ffdxn -e directory$depth >../output 2>../error && + + test_must_be_empty ../output && + # We especially do not want things like + # "warning: could not open directory " + # appearing in the error output. It is true that directories + # that are too long cannot be opened, but we should not be + # recursing into those directories anyway since the very first + # level is ignored. + test_must_be_empty ../error && + + # alpine-linux-musl fails to "rm -rf" a directory with such + # a deeply nested hierarchy. Help it out by deleting the + # leading directories ourselves. Super slow, but, what else + # can we do? Without this, we will hit a + # error: Tests passed but test cleanup failed; aborting + # so do this ugly manual cleanup... + while test ! -f directory-random-file.txt; do + name=$(ls -d directory*) && + mv $name/* . && + rmdir $name + done + ) +' + test_done