Message ID | 20220528231118.3504387-3-gitster@pobox.com (mailing list archive) |
---|---|
State | Accepted |
Commit | de1f68a968e64b3e1e2979222238fec1f045bbf3 |
Headers | show |
Series | js/scalar-diagnose rebased | expand |
On Sat, May 28, 2022 at 04:11:13PM -0700, Junio C Hamano wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > By allowing the path to be enclosed in double-quotes, we can avoid > the limitation that paths cannot contain colons. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > * Tightened shell variable quoting > > <snip> > > diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh > index d6027189e2..3992d08158 100755 > --- a/t/t5003-archive-zip.sh > +++ b/t/t5003-archive-zip.sh > @@ -207,13 +207,21 @@ check_zip with_untracked > check_added with_untracked untracked untracked > > test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > + if test_have_prereq FUNNYNAMES > + then > + PATHNAME="pathname with : colon" > + else > + PATHNAME="pathname without colon" > + fi && > git archive --format=zip >with_file_with_content.zip \ > + --add-virtual-file=\""$PATHNAME"\": \ > --add-virtual-file=hello:world $EMPTY_TREE && > test_when_finished "rm -rf tmp-unpack" && > mkdir tmp-unpack && ( > cd tmp-unpack && > "$GIT_UNZIP" ../with_file_with_content.zip && > test_path_is_file hello && > + test_path_is_file "$PATHNAME" && > test world = $(cat hello) > ) > ' This test is currently failing on Cygwin: it looks like it's exposing a bug in Cygwin that means files with colons in their name aren't correctly extracted from zip archives. I'm going to report that to the Cygwin mailing list, but I wanted to note it for the record here, too. Adam
Adam Dinwoodie <adam@dinwoodie.org> writes: >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh >> index d6027189e2..3992d08158 100755 >> --- a/t/t5003-archive-zip.sh >> +++ b/t/t5003-archive-zip.sh >> @@ -207,13 +207,21 @@ check_zip with_untracked >> check_added with_untracked untracked untracked >> >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' >> + if test_have_prereq FUNNYNAMES >> + then >> + PATHNAME="pathname with : colon" >> + else >> + PATHNAME="pathname without colon" >> + fi && >> git archive --format=zip >with_file_with_content.zip \ >> + --add-virtual-file=\""$PATHNAME"\": \ >> --add-virtual-file=hello:world $EMPTY_TREE && >> test_when_finished "rm -rf tmp-unpack" && >> mkdir tmp-unpack && ( >> cd tmp-unpack && >> "$GIT_UNZIP" ../with_file_with_content.zip && >> test_path_is_file hello && >> + test_path_is_file "$PATHNAME" && >> test world = $(cat hello) >> ) >> ' > > This test is currently failing on Cygwin: it looks like it's exposing a > bug in Cygwin that means files with colons in their name aren't > correctly extracted from zip archives. I'm going to report that to the > Cygwin mailing list, but I wanted to note it for the record here, too. Does this mean that our code to set FUNNYNAMES prerequiste is slightly broken? IOW, should we check with a path with a colon in it, as well as whatever we use currently for FUNNYNAMES? Something like the attached patch? Or does Cygwin otherwise work perfectly well with a path with a colon in it, but only $GIT_UNZIP command has problem with it? If that is the case, then please disregard the attached. Thanks. t/test-lib.sh | 1 + 1 file changed, 1 insertion(+) diff --git i/t/test-lib.sh w/t/test-lib.sh index 55857af601..5dce7d95c9 100644 --- i/t/test-lib.sh +++ w/t/test-lib.sh @@ -1620,6 +1620,7 @@ test_lazy_prereq FUNNYNAMES ' touch -- \ "FUNNYNAMES tab embedded" \ "FUNNYNAMES \"quote embedded\"" \ + "FUNNYNAMES colon : embedded" \ "FUNNYNAMES newline embedded" 2>/dev/null && rm -- \
On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote: > Adam Dinwoodie <adam@dinwoodie.org> writes: > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh > >> index d6027189e2..3992d08158 100755 > >> --- a/t/t5003-archive-zip.sh > >> +++ b/t/t5003-archive-zip.sh > >> @@ -207,13 +207,21 @@ check_zip with_untracked > >> check_added with_untracked untracked untracked > >> > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > >> + if test_have_prereq FUNNYNAMES > >> + then > >> + PATHNAME="pathname with : colon" > >> + else > >> + PATHNAME="pathname without colon" > >> + fi && > >> git archive --format=zip >with_file_with_content.zip \ > >> + --add-virtual-file=\""$PATHNAME"\": \ > >> --add-virtual-file=hello:world $EMPTY_TREE && > >> test_when_finished "rm -rf tmp-unpack" && > >> mkdir tmp-unpack && ( > >> cd tmp-unpack && > >> "$GIT_UNZIP" ../with_file_with_content.zip && > >> test_path_is_file hello && > >> + test_path_is_file "$PATHNAME" && > >> test world = $(cat hello) > >> ) > >> ' > > > > This test is currently failing on Cygwin: it looks like it's exposing a > > bug in Cygwin that means files with colons in their name aren't > > correctly extracted from zip archives. I'm going to report that to the > > Cygwin mailing list, but I wanted to note it for the record here, too. > > Does this mean that our code to set FUNNYNAMES prerequiste is > slightly broken? IOW, should we check with a path with a colon in > it, as well as whatever we use currently for FUNNYNAMES? > > Something like the attached patch? > > Or does Cygwin otherwise work perfectly well with a path with a > colon in it, but only $GIT_UNZIP command has problem with it? If > that is the case, then please disregard the attached. The latter: Cygwin works perfectly with paths containing colons, except that Cygwin's `unzip` is seemingly buggy and doesn't work. The file systems Cygwin runs on don't support colons in paths, but Cygwin hides that problem by rewriting ASCII colons to some high Unicode code point on the filesystem, meaning Cygwin-native applications see a regular colon, while Windows-native applications see an unusual but perfectly valid Unicode character. I tested the same patch to FUNNYNAMES myself before reporting, and the test fails exactly the same way. If we wanted to catch this, I think we'd need a test that explicitly attempted to unzip an archive containing a path with a colon. (The code to set FUNNYNAMES *is* slightly broken, per the discussions around 6d340dfaef ("t9902: split test to run on appropriate systems", 2022-04-08), and my to-do list still features tidying up and resubmitting the patch Ævar wrote in that discussion thread. But it wouldn't help here because this issue is specific to Cygwin's `unzip`, rather than a general limitation of running on Cygwin.)
Hi Adam, On Wed, 15 Jun 2022, Adam Dinwoodie wrote: > On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote: > > Adam Dinwoodie <adam@dinwoodie.org> writes: > > > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh > > >> index d6027189e2..3992d08158 100755 > > >> --- a/t/t5003-archive-zip.sh > > >> +++ b/t/t5003-archive-zip.sh > > >> @@ -207,13 +207,21 @@ check_zip with_untracked > > >> check_added with_untracked untracked untracked > > >> > > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > > >> + if test_have_prereq FUNNYNAMES > > >> + then > > >> + PATHNAME="pathname with : colon" > > >> + else > > >> + PATHNAME="pathname without colon" > > >> + fi && > > >> git archive --format=zip >with_file_with_content.zip \ > > >> + --add-virtual-file=\""$PATHNAME"\": \ > > >> --add-virtual-file=hello:world $EMPTY_TREE && > > >> test_when_finished "rm -rf tmp-unpack" && > > >> mkdir tmp-unpack && ( > > >> cd tmp-unpack && > > >> "$GIT_UNZIP" ../with_file_with_content.zip && > > >> test_path_is_file hello && > > >> + test_path_is_file "$PATHNAME" && > > >> test world = $(cat hello) > > >> ) > > >> ' > > > > > > This test is currently failing on Cygwin: it looks like it's exposing a > > > bug in Cygwin that means files with colons in their name aren't > > > correctly extracted from zip archives. I'm going to report that to the > > > Cygwin mailing list, but I wanted to note it for the record here, too. > > > > Does this mean that our code to set FUNNYNAMES prerequiste is > > slightly broken? IOW, should we check with a path with a colon in > > it, as well as whatever we use currently for FUNNYNAMES? > > > > Something like the attached patch? > > > > Or does Cygwin otherwise work perfectly well with a path with a > > colon in it, but only $GIT_UNZIP command has problem with it? If > > that is the case, then please disregard the attached. > > The latter: Cygwin works perfectly with paths containing colons, except > that Cygwin's `unzip` is seemingly buggy and doesn't work. The file > systems Cygwin runs on don't support colons in paths, but Cygwin hides > that problem by rewriting ASCII colons to some high Unicode code point > on the filesystem, Let me throw in a bit more detail: The forbidden characters are mapped into the Unicode page U+f0XX, which is supposed to be used "for private purposes". Even more detail can be found here: https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23 > meaning Cygwin-native applications see a regular colon, while > Windows-native applications see an unusual but perfectly valid Unicode > character. Now, I have two questions: - Why does `unzip` not use Cygwin's regular functions (which should all be aware of that U+f0XX <-> U+00XX mapping)? - Even more importantly: would the test case pass if we simply used another forbidden character, such as `?` or `*`? > I tested the same patch to FUNNYNAMES myself before reporting, and the > test fails exactly the same way. If we wanted to catch this, I think > we'd need a test that explicitly attempted to unzip an archive > containing a path with a colon. > > (The code to set FUNNYNAMES *is* slightly broken, per the discussions > around 6d340dfaef ("t9902: split test to run on appropriate systems", > 2022-04-08), and my to-do list still features tidying up and > resubmitting the patch Ævar wrote in that discussion thread. But it > wouldn't help here because this issue is specific to Cygwin's `unzip`, > rather than a general limitation of running on Cygwin.) I'd rather avoid changing FUNNYNAMES at this stage, if we can help it. Thanks, Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> I'd rather avoid changing FUNNYNAMES at this stage, if we can help it.
I wonder if it is sufficient to ask "unzip -l" the names of the
files in the archive, without having to materialize these files on
the filesystem. Would that bypass the whole FUNNYNAMES business, or
is "unzip" paranoid enough to reject an archive, even when it is not
extracting into the local filesystem, with a path that it would not
be able to extract if it were asked to?
I do not know how standardized different implementations of "unzip"
is, and how similar output "unzip -l" implementations produce are,
but the following seems to pass for me locally.
t/t5003-archive-zip.sh | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git c/t/t5003-archive-zip.sh w/t/t5003-archive-zip.sh
index 3992d08158..f2fdf2c235 100755
--- c/t/t5003-archive-zip.sh
+++ w/t/t5003-archive-zip.sh
@@ -207,23 +207,13 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
- if test_have_prereq FUNNYNAMES
- then
- PATHNAME="pathname with : colon"
- else
- PATHNAME="pathname without colon"
- fi &&
+ PATHNAME="pathname with : colon" &&
git archive --format=zip >with_file_with_content.zip \
--add-virtual-file=\""$PATHNAME"\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
- test_when_finished "rm -rf tmp-unpack" &&
- mkdir tmp-unpack && (
- cd tmp-unpack &&
- "$GIT_UNZIP" ../with_file_with_content.zip &&
- test_path_is_file hello &&
- test_path_is_file "$PATHNAME" &&
- test world = $(cat hello)
- )
+ "$GIT_UNZIP" -l with_file_with_content.zip >toc &&
+ grep -e " $PATHNAME\$" toc &&
+ grep -e " hello\$" toc
'
test_expect_success 'git archive --format=zip --add-file twice' '
On Sat, Jun 18, 2022 at 10:19:28PM +0200, Johannes Schindelin wrote: > Hi Adam, > > On Wed, 15 Jun 2022, Adam Dinwoodie wrote: > > > On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote: > > > Adam Dinwoodie <adam@dinwoodie.org> writes: > > > > > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh > > > >> index d6027189e2..3992d08158 100755 > > > >> --- a/t/t5003-archive-zip.sh > > > >> +++ b/t/t5003-archive-zip.sh > > > >> @@ -207,13 +207,21 @@ check_zip with_untracked > > > >> check_added with_untracked untracked untracked > > > >> > > > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > > > >> + if test_have_prereq FUNNYNAMES > > > >> + then > > > >> + PATHNAME="pathname with : colon" > > > >> + else > > > >> + PATHNAME="pathname without colon" > > > >> + fi && > > > >> git archive --format=zip >with_file_with_content.zip \ > > > >> + --add-virtual-file=\""$PATHNAME"\": \ > > > >> --add-virtual-file=hello:world $EMPTY_TREE && > > > >> test_when_finished "rm -rf tmp-unpack" && > > > >> mkdir tmp-unpack && ( > > > >> cd tmp-unpack && > > > >> "$GIT_UNZIP" ../with_file_with_content.zip && > > > >> test_path_is_file hello && > > > >> + test_path_is_file "$PATHNAME" && > > > >> test world = $(cat hello) > > > >> ) > > > >> ' > > > > > > > > This test is currently failing on Cygwin: it looks like it's exposing a > > > > bug in Cygwin that means files with colons in their name aren't > > > > correctly extracted from zip archives. I'm going to report that to the > > > > Cygwin mailing list, but I wanted to note it for the record here, too. > > > > > > Does this mean that our code to set FUNNYNAMES prerequiste is > > > slightly broken? IOW, should we check with a path with a colon in > > > it, as well as whatever we use currently for FUNNYNAMES? > > > > > > Something like the attached patch? > > > > > > Or does Cygwin otherwise work perfectly well with a path with a > > > colon in it, but only $GIT_UNZIP command has problem with it? If > > > that is the case, then please disregard the attached. > > > > The latter: Cygwin works perfectly with paths containing colons, except > > that Cygwin's `unzip` is seemingly buggy and doesn't work. The file > > systems Cygwin runs on don't support colons in paths, but Cygwin hides > > that problem by rewriting ASCII colons to some high Unicode code point > > on the filesystem, > > Let me throw in a bit more detail: The forbidden characters are mapped > into the Unicode page U+f0XX, which is supposed to be used "for private > purposes". Even more detail can be found here: > https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23 > > > meaning Cygwin-native applications see a regular colon, while > > Windows-native applications see an unusual but perfectly valid Unicode > > character. > > Now, I have two questions: > > - Why does `unzip` not use Cygwin's regular functions (which should all be > aware of that U+f0XX <-> U+00XX mapping)? That is an excellent question! This behaviour came from an `#ifdef __CYGWIN__` in the upstream unzip package; with that #ifdef removed, everything works as expected. The folk on the Cygwin mailing list had no idea *why* that #ifdef was there, given it's evidently unnecessary; my best guess is that it was added a long time ago before Cygwin could handle those characters in the general case. Since my report, the Cygwin package has picked up a new maintainer who has released a version of the unzip package with that #ifdef removed, so this test is now passing. > - Even more importantly: would the test case pass if we simply used > another forbidden character, such as `?` or `*`? The set of characters that had special handling in unzip was "*:?|<> all of which are handled appropriately by Cygwin applications in general, and all of which had this unnecessary handling in `unzip` > > I tested the same patch to FUNNYNAMES myself before reporting, and the > > test fails exactly the same way. If we wanted to catch this, I think > > we'd need a test that explicitly attempted to unzip an archive > > containing a path with a colon. > > > > (The code to set FUNNYNAMES *is* slightly broken, per the discussions > > around 6d340dfaef ("t9902: split test to run on appropriate systems", > > 2022-04-08), and my to-do list still features tidying up and > > resubmitting the patch Ævar wrote in that discussion thread. But it > > wouldn't help here because this issue is specific to Cygwin's `unzip`, > > rather than a general limitation of running on Cygwin.) > > I'd rather avoid changing FUNNYNAMES at this stage, if we can help it. Oh yes, I definitely wasn't proposing changing things for 2.37.0! I just wanted to acknowledge that there is a known issue here that has been discussed on this list previously, that we (I) would hopefully get around to fixing at some point. Adam
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt index b41cc5bc2e..56989a2f34 100644 --- a/Documentation/git-archive.txt +++ b/Documentation/git-archive.txt @@ -69,10 +69,16 @@ OPTIONS by concatenating the value of the last `--prefix` option (if any) before this `--add-virtual-file` and `<path>`. + -The `<path>` cannot contain any colon, the file mode is limited to -a regular file, and the option may be subject to platform-dependent -command-line limits. For non-trivial cases, write an untracked file -and use `--add-file` instead. +The `<path>` argument can start and end with a literal double-quote +character; the contained file name is interpreted as a C-style string, +i.e. the backslash is interpreted as escape character. The path must +be quoted if it contains a colon, to avoid the colon from being +misinterpreted as the separator between the path and the contents, or +if the path begins or ends with a double-quote character. ++ +The file mode is limited to a regular file, and the option may be +subject to platform-dependent command-line limits. For non-trivial +cases, write an untracked file and use `--add-file` instead. --worktree-attributes:: Look for attributes in .gitattributes files in the working tree diff --git a/archive.c b/archive.c index d26f4ef945..48aba4ac46 100644 --- a/archive.c +++ b/archive.c @@ -9,6 +9,7 @@ #include "parse-options.h" #include "unpack-trees.h" #include "dir.h" +#include "quote.h" static char const * const archive_usage[] = { N_("git archive [<options>] <tree-ish> [<path>...]"), @@ -535,22 +536,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset) die(_("Not a regular file: %s"), path); info->content = NULL; /* read the file later */ } else if (!strcmp(opt->long_name, "add-virtual-file")) { - const char *colon = strchr(arg, ':'); - char *p; + struct strbuf buf = STRBUF_INIT; + const char *p = arg; + + if (*p != '"') + p = strchr(p, ':'); + else if (unquote_c_style(&buf, p, &p) < 0) + die(_("unclosed quote: '%s'"), arg); - if (!colon) + if (!p || *p != ':') die(_("missing colon: '%s'"), arg); - p = xstrndup(arg, colon - arg); - if (!args->prefix) - path = p; - else { - path = prefix_filename(args->prefix, p); - free(p); + if (p == arg) + die(_("empty file name: '%s'"), arg); + + path = buf.len ? + strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg); + + if (args->prefix) { + char *save = path; + path = prefix_filename(args->prefix, path); + free(save); } memset(&info->stat, 0, sizeof(info->stat)); info->stat.st_mode = S_IFREG | 0644; - info->content = xstrdup(colon + 1); + info->content = xstrdup(p + 1); info->stat.st_size = strlen(info->content); } else { BUG("add_file_cb() called for %s", opt->long_name); diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index d6027189e2..3992d08158 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -207,13 +207,21 @@ check_zip with_untracked check_added with_untracked untracked untracked test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' + if test_have_prereq FUNNYNAMES + then + PATHNAME="pathname with : colon" + else + PATHNAME="pathname without colon" + fi && git archive --format=zip >with_file_with_content.zip \ + --add-virtual-file=\""$PATHNAME"\": \ --add-virtual-file=hello:world $EMPTY_TREE && test_when_finished "rm -rf tmp-unpack" && mkdir tmp-unpack && ( cd tmp-unpack && "$GIT_UNZIP" ../with_file_with_content.zip && test_path_is_file hello && + test_path_is_file "$PATHNAME" && test world = $(cat hello) ) '