Message ID | 20201005072102.GE2291074@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Superseded |
Series | forbidding symlinked .gitattributes and .gitignore |
(+cc: Dscho for NTFS savvy)

Jeff King wrote:

> We have tests that cover various filesystem-specific spellings of
> ".gitmodules", because we need to reliably identify that path for some
> security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
> add tests, 2018-05-12), with the actual code coming from e7cb0b4455
> (is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
> (is_hfs_dotgit: match other .git files, 2018-05-02).
>
> Those latter two commits also added similar matching functions for
> .gitattributes and .gitignore. These ended up not being used in the
> final series, and are currently dead code. But in preparation for them
> being used, let's make sure they actually work by throwing a few basic
> checks at them.
>
> I didn't bother with the whole battery of tests that we cover for
> .gitmodules. These functions are all based on the same generic matcher,
> so it's sufficient to test most of the corner cases just once.

Yeah, that's reasonable.

> Note that the ntfs magic prefix names in the tests come from the
> algorithm described in e7cb0b4455 (and are different for each file).

Doesn't block this patch, but I'm curious: how hard would it be to make
a test with an NTFS prerequisite that makes sure we got the magic prefix
right?
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/helper/test-path-utils.c | 41 ++++++++++++++++++++++++++------------
>  t/t0060-path-utils.sh      | 20 +++++++++++++++++++
>  2 files changed, 48 insertions(+), 13 deletions(-)
>
> diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
> index 313a153209..9e253f8058 100644
> --- a/t/helper/test-path-utils.c
> +++ b/t/helper/test-path-utils.c
> @@ -172,9 +172,22 @@ static struct test_data dirname_data[] = {
>  	{ NULL, NULL }
>  };
>
> -static int is_dotgitmodules(const char *path)
> +static int check_dotgitx(const char *x, const char **argv,
> +			 int (*is_hfs)(const char *),
> +			 int (*is_ntfs)(const char *))
>  {
> -	return is_hfs_dotgitmodules(path) || is_ntfs_dotgitmodules(path);
> +	int res = 0, expect = 1;
> +	for (; *argv; argv++) {
> +		if (!strcmp("--not", *argv))
> +			expect = !expect;
> +		else if (expect != (is_hfs(*argv) || is_ntfs(*argv)))
> +			res = error("'%s' is %s.%s", *argv,
> +				    expect ? "not " : "", x);
> +		else
> +			fprintf(stderr, "ok: '%s' is %s.%s\n",
> +				*argv, expect ? "" : "not ", x);

micronit: extra space on the "res" line.

This "if" cascade is a little hard to read, even though it does the
right thing. Can we make it more explicit? E.g.

	if (!strcmp("--not", *argv)) {
		expect = !expect;
		continue;
	}
	actual = is_hfs(*argv) || is_ntfs(*argv);
	fprintf(stderr, "%s: '%s' is %s%s",
		expect == actual ? "ok" : "error",
		*argv, actual ? "" : "not ", x);
	if (expect != actual)
		res = -1;

I think it's a little easier to read with either (a) the dot included
in the 'x' parameter or (b) the entire ".git" missing from the 'x'
parameter.

[...]
> index 56db5c8aba..b2e3cf3f4c 100755
> --- a/t/t0060-path-utils.sh
> +++ b/t/t0060-path-utils.sh
> @@ -468,6 +468,26 @@ test_expect_success 'match .gitmodules' '
>  		.gitmodules,:\$DATA
>  '
>
> +test_expect_success 'match .gitattributes' '
> +	test-tool path-utils is_dotgitattributes \
> +		.gitattributes \
> +		.git${u200c}attributes \
> +		.Gitattributes \
> +		.gitattributeS \
> +		GITATT~1 \
> +		GI7D29~1
> +'
> +
> +test_expect_success 'match .gitignore' '
> +	test-tool path-utils is_dotgitignore \
> +		.gitignore \
> +		.git${u200c}ignore \
> +		.Gitignore \
> +		.gitignorE \
> +		GITIGN~1 \
> +		GI250A~1
> +'
> +
>  test_expect_success MINGW 'is_valid_path() on Windows' '
>  	test-tool path-utils is_valid_path \
>  		win32 \

With whatever subset of the changes above makes sense,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.
On Mon, Oct 05, 2020 at 01:03:53AM -0700, Jonathan Nieder wrote:

> > Note that the ntfs magic prefix names in the tests come from the
> > algorithm described in e7cb0b4455 (and are different for each file).
>
> Doesn't block this patch, but I'm curious: how hard would it be to make
> a test with an NTFS prerequisite that makes sure we got the magic prefix
> right?

I suspect hard since Dscho punted on it in the original series. :) If I
understand correctly, it would require having an NTFS filesystem, and
generating 10,000+ files with a clashing prefix.

> > +	for (; *argv; argv++) {
> > +		if (!strcmp("--not", *argv))
> > +			expect = !expect;
> > +		else if (expect != (is_hfs(*argv) || is_ntfs(*argv)))
> > +			res = error("'%s' is %s.%s", *argv,
> > +				    expect ? "not " : "", x);
> > +		else
> > +			fprintf(stderr, "ok: '%s' is %s.%s\n",
> > +				*argv, expect ? "" : "not ", x);
> >
> micronit: extra space on the "res" line.

Thanks, fixed.

> This "if" cascade is a little hard to read, even though it does the
> right thing. Can we make it more explicit? E.g.

This is directly moved from the existing code. I'd prefer to keep the
overall structure intact to make that clear.

> I think it's a little easier to read with either (a) the dot included
> in the 'x' parameter or (b) the entire ".git" missing from the 'x'
> parameter.

Yeah, I agree that's worth doing. I took (b), as "dotgitx" implies that
"x" is "modules", etc.

I had originally planned to automatically turn "gitmodules" into
"is_ntfs_dotgitmodules", too, but it required macros and string-pasting.
So I decided it was a bit too ugly. :)

-Peff
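For readers following along, here is a minimal, self-contained sketch of what the helper could look like with the loop restructured as Jonathan suggested and with option (b), where "x" omits the ".git" prefix. It is an illustration only, not the code that was committed: fake_is_hfs() and fake_is_ntfs() are hypothetical stand-ins that do exact comparisons, and git's error() is replaced by a plain fprintf() so the example compiles without git's sources.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-ins for is_hfs_dotgitignore() and
 * is_ntfs_dotgitignore(); they only do exact matches.
 */
static int fake_is_hfs(const char *path)
{
	return !strcmp(path, ".gitignore");
}

static int fake_is_ntfs(const char *path)
{
	return !strcmp(path, "GITIGN~1");
}

/* "x" is the part after ".git", e.g. "ignore" (option (b)). */
static int check_dotgitx(const char *x, const char **argv,
			 int (*is_hfs)(const char *),
			 int (*is_ntfs)(const char *))
{
	int res = 0, expect = 1;

	for (; *argv; argv++) {
		int actual;

		if (!strcmp("--not", *argv)) {
			/* Every later argument is expected NOT to match. */
			expect = !expect;
			continue;
		}
		actual = !!(is_hfs(*argv) || is_ntfs(*argv));
		fprintf(stderr, "%s: '%s' is %s.git%s\n",
			expect == actual ? "ok" : "error",
			*argv, actual ? "" : "not ", x);
		if (expect != actual)
			res = 1;
	}
	return res; /* 0 if every argument behaved as expected */
}
```

Called with { ".gitignore", "GITIGN~1", "--not", "foo", NULL } and the two stand-in matchers, the helper reports three "ok" lines and returns 0; a single unexpected result makes it return 1.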
Hi Peff & Jonathan N,

On Mon, 5 Oct 2020, Jeff King wrote:

> On Mon, Oct 05, 2020 at 01:03:53AM -0700, Jonathan Nieder wrote:
>
> > > Note that the ntfs magic prefix names in the tests come from the
> > > algorithm described in e7cb0b4455 (and are different for each file).
> >
> > Doesn't block this patch, but I'm curious: how hard would it be to make
> > a test with an NTFS prerequisite that makes sure we got the magic prefix
> > right?
>
> I suspect hard since Dscho punted on it in the original series. :) If I
> understand correctly, it would require having an NTFS filesystem, and
> generating 10,000+ files with a clashing prefix.

It's not quite _as_ bad: you only need to generate 4 files with a clashing
prefix and then the real one:

-- snip --
me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitattributes1

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitattributes2

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitattributes3

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitattributes4

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitattributes

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitignore1

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitignore2

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitignore3

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitignore4

me@work MINGW64 ~/repros/ntfs-short-names
$ touch .gitignore

me@work MINGW64 ~/repros/ntfs-short-names
$ cmd //c dir //x
 Volume in drive C is OSDisk
 Volume Serial Number is 5E6B-4E77

 Directory of C:\Users\me\repros\ntfs-short-names

10/05/2020  11:11 PM    <DIR>                       .
10/05/2020  11:11 PM    <DIR>                       ..
10/05/2020  11:08 PM                 0 GI7D29~1     .gitattributes
10/05/2020  11:08 PM                 0 GITATT~1     .gitattributes1
10/05/2020  11:08 PM                 0 GITATT~2     .gitattributes2
10/05/2020  11:08 PM                 0 GITATT~3     .gitattributes3
10/05/2020  11:08 PM                 0 GITATT~4     .gitattributes4
10/05/2020  11:11 PM                 0 GI250A~1     .gitignore
10/05/2020  11:11 PM                 0 GITIGN~1     .gitignore1
10/05/2020  11:11 PM                 0 GITIGN~2     .gitignore2
10/05/2020  11:11 PM                 0 GITIGN~3     .gitignore3
10/05/2020  11:11 PM                 0 GITIGN~4     .gitignore4
              10 File(s)              0 bytes
               2 Dir(s)  314,658,705,408 bytes free
-- snap --

But I don't necessarily think that it would make sense to add that test:
it adds churn _every_ time the regression test is run, and by deity, it
sure takes way too long on Windows _already_, and the test would be for a
regression _in the NTFS driver_.

At this stage, I also highly doubt that the algorithm will change ever
again (the last time it changed was several Windows versions ago, I want
to say in Windows XP, but it could have been all the way back to NT).

In light of that, I'd say that the bang is rather small and the buck would
be not small at all, and would have to be paid by developers on Windows
who already pay a disproportionately high price when running the test
suite, so...

Ciao,
Dscho
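The listing above shows the two shapes of short name that the tests feed to the matchers: the first six characters plus "~1" through "~4" (GITATT~1), and, once those four slots are taken, two characters plus four hex digits plus "~1" (GI7D29~1). The following is a simplified sketch of recognizing both shapes. It is not git's is_ntfs_dotgit*() code: looks_like_short_name() and its parameters are hypothetical, the six-character prefixes are passed in by the caller rather than computed, trailing spaces, periods and ":stream" suffixes are not handled, and accepting any counter from 1 to 9 after the hashed prefix is an assumption.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes (ASCII only). */
static int same_prefix(const char *a, const char *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!a[i] || !b[i])
			return 0;
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 0;
	}
	return 1;
}

/*
 * Hypothetical helper: does "name" have one of the two short-name
 * shapes seen in the listing? "first_six" is e.g. "gitatt" and
 * "hashed_six" is e.g. "gi7d29" (taken from the tests, not computed).
 */
static int looks_like_short_name(const char *name,
				 const char *first_six,
				 const char *hashed_six)
{
	if (strlen(name) != 8 || name[6] != '~')
		return 0;
	/* GITATT~1 .. GITATT~4: plain truncation plus a counter */
	if (same_prefix(name, first_six, 6))
		return name[7] >= '1' && name[7] <= '4';
	/* GI7D29~1: fallback form; counter range is an assumption */
	if (same_prefix(name, hashed_six, 6))
		return name[7] >= '1' && name[7] <= '9';
	return 0;
}
```

With "gitatt" and "gi7d29" as prefixes, both GITATT~1 and GI7D29~1 are accepted, while GITATT~5 and the long name ".gitattributes" are rejected by this particular check.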
On Mon, Oct 05, 2020 at 11:20:48PM +0200, Johannes Schindelin wrote:

> > > > Note that the ntfs magic prefix names in the tests come from the
> > > > algorithm described in e7cb0b4455 (and are different for each file).
> > >
> > > Doesn't block this patch, but I'm curious: how hard would it be to make
> > > a test with an NTFS prerequisite that makes sure we got the magic prefix
> > > right?
> >
> > I suspect hard since Dscho punted on it in the original series. :) If I
> > understand correctly, it would require having an NTFS filesystem, and
> > generating 10,000+ files with a clashing prefix.
>
> It's not quite _as_ bad: you only need to generate 4 files with a clashing
> prefix and then the real one:

Ah, that really isn't that bad, then. Still, I don't mind leaving this
as-is under the notion that if the algorithm does change, it would
likely make it onto your radar anyway (or the radar of _anybody_ who
would raise the issue).

-Peff
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 313a153209..9e253f8058 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -172,9 +172,22 @@ static struct test_data dirname_data[] = {
 	{ NULL, NULL }
 };
 
-static int is_dotgitmodules(const char *path)
+static int check_dotgitx(const char *x, const char **argv,
+			 int (*is_hfs)(const char *),
+			 int (*is_ntfs)(const char *))
 {
-	return is_hfs_dotgitmodules(path) || is_ntfs_dotgitmodules(path);
+	int res = 0, expect = 1;
+	for (; *argv; argv++) {
+		if (!strcmp("--not", *argv))
+			expect = !expect;
+		else if (expect != (is_hfs(*argv) || is_ntfs(*argv)))
+			res = error("'%s' is %s.%s", *argv,
+				    expect ? "not " : "", x);
+		else
+			fprintf(stderr, "ok: '%s' is %s.%s\n",
+				*argv, expect ? "" : "not ", x);
+	}
+	return !!res;
 }
 
 static int cmp_by_st_size(const void *a, const void *b)
@@ -382,17 +395,19 @@ int cmd__path_utils(int argc, const char **argv)
 		return test_function(dirname_data, posix_dirname, argv[1]);
 
 	if (argc > 2 && !strcmp(argv[1], "is_dotgitmodules")) {
-		int res = 0, expect = 1, i;
-		for (i = 2; i < argc; i++)
-			if (!strcmp("--not", argv[i]))
-				expect = !expect;
-			else if (expect != is_dotgitmodules(argv[i]))
-				res = error("'%s' is %s.gitmodules", argv[i],
-					    expect ? "not " : "");
-			else
-				fprintf(stderr, "ok: '%s' is %s.gitmodules\n",
-					argv[i], expect ? "" : "not ");
-		return !!res;
+		return check_dotgitx("gitmodules", argv + 2,
+				     is_hfs_dotgitmodules,
+				     is_ntfs_dotgitmodules);
+	}
+	if (argc > 2 && !strcmp(argv[1], "is_dotgitignore")) {
+		return check_dotgitx("gitignore", argv + 2,
+				     is_hfs_dotgitignore,
+				     is_ntfs_dotgitignore);
+	}
+	if (argc > 2 && !strcmp(argv[1], "is_dotgitattributes")) {
+		return check_dotgitx("gitattributes", argv + 2,
+				     is_hfs_dotgitattributes,
+				     is_ntfs_dotgitattributes);
 	}
 
 	if (argc > 2 && !strcmp(argv[1], "file-size")) {
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 56db5c8aba..b2e3cf3f4c 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -468,6 +468,26 @@ test_expect_success 'match .gitmodules' '
 		.gitmodules,:\$DATA
 '
 
+test_expect_success 'match .gitattributes' '
+	test-tool path-utils is_dotgitattributes \
+		.gitattributes \
+		.git${u200c}attributes \
+		.Gitattributes \
+		.gitattributeS \
+		GITATT~1 \
+		GI7D29~1
+'
+
+test_expect_success 'match .gitignore' '
+	test-tool path-utils is_dotgitignore \
+		.gitignore \
+		.git${u200c}ignore \
+		.Gitignore \
+		.gitignorE \
+		GITIGN~1 \
+		GI250A~1
+'
+
 test_expect_success MINGW 'is_valid_path() on Windows' '
 	test-tool path-utils is_valid_path \
 		win32 \
We have tests that cover various filesystem-specific spellings of
".gitmodules", because we need to reliably identify that path for some
security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
add tests, 2018-05-12), with the actual code coming from e7cb0b4455
(is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
(is_hfs_dotgit: match other .git files, 2018-05-02).

Those latter two commits also added similar matching functions for
.gitattributes and .gitignore. These ended up not being used in the
final series, and are currently dead code. But in preparation for them
being used, let's make sure they actually work by throwing a few basic
checks at them.

I didn't bother with the whole battery of tests that we cover for
.gitmodules. These functions are all based on the same generic matcher,
so it's sufficient to test most of the corner cases just once.

Note that the ntfs magic prefix names in the tests come from the
algorithm described in e7cb0b4455 (and are different for each file).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/helper/test-path-utils.c | 41 ++++++++++++++++++++++++++------------
 t/t0060-path-utils.sh      | 20 +++++++++++++++++++
 2 files changed, 48 insertions(+), 13 deletions(-)
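As background on the "filesystem-specific spellings" the new tests exercise on the HFS side (".git${u200c}attributes", ".Gitattributes", ".gitattributeS"): HFS+ ignores certain Unicode code points such as U+200C and compares names case-insensitively, so all of these can refer to the same file. The sketch below illustrates that idea only. It is not git's is_hfs_dotgit*() implementation, the function name is hypothetical, it skips only U+200C (a real matcher has to handle a longer list of ignorable code points), and it folds case for ASCII only.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compare "path" against "want", skipping
 * U+200C ZERO WIDTH NON-JOINER (UTF-8 bytes E2 80 8C) in "path"
 * and ignoring ASCII case. Returns 1 on a match, 0 otherwise.
 */
static int matches_ignoring_zwnj_and_case(const char *path, const char *want)
{
	const unsigned char *p = (const unsigned char *)path;
	const unsigned char *w = (const unsigned char *)want;

	while (*p || *w) {
		if (p[0] == 0xe2 && p[1] == 0x80 && p[2] == 0x8c) {
			p += 3; /* ignorable code point, skip it */
			continue;
		}
		if (tolower(*p) != tolower(*w))
			return 0; /* also covers one string ending early */
		p++;
		w++;
	}
	return 1;
}
```

For example, the byte string ".git" followed by E2 80 8C and "attributes" matches ".gitattributes" under this check, as do ".Gitattributes" and ".gitattributeS", while the shorter ".gitattribute" does not.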