Message ID | 7eebcf27b45eb13541d4abae70a374a0e35ab6b8.1653145696.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | scalar: implement the subcommand "diagnose" | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > + if test_have_prereq FUNNYNAMES > + then > + PATHNAME=quoted:colon > + else > + PATHNAME=quoted > + fi && > git archive --format=zip >with_file_with_content.zip \ > + --add-virtual-file=\"$PATHNAME\": \ The name is better, but this still limits what can be in PATHNAME. Write either one of these: --add-virtual-file="\"$PATHNAME\":" \ --add-virtual-file=\""$PATHNAME"\": \ to signal the intention better to future readers. We are showing an explicit dq-pair we want to pass to the c-unquote machinery, and we are showing that we are not being unnecessarily loose by protecting the string from getting word split. Either is fine, but leaving it unquoted is not. > + test_path_is_file $PATHNAME && Ditto. There is no reason to forbid future developers from futzing the test to include space in the PATHNAME variable. IOW, I want us to be better than saying I know there is no $IFS whitespace now because I just wrote it. Because I do not think there is any need to test with a string with whitespace in it, I will leave the variable unquoted. Anybody who changes the variable and breaks this assumption have only themselves to blame for breaking the tests. It is not my fault and it is not my problem. which is the signal our readers would get from this patch (I would, if I were reading this commit as a third-party), especially once they become aware of the fact that this exact issue was already pointed out during the review discussion. Using double-quote appropriately sends a strong signal to reviewers and future developers that we care about details. A valid alternative is to write the assumption out where we currently assign to PATHNAME. # The PATHNAME variable is used without quote in the code # below for such and such reasons, so you cannot use a $IFS # whitespace in it. if test_have_prereq FUNNYNAMES then ... If the "defensive" measure that is necessary to avoid a limitation is too onerous, such an approach may be very much more preferrable than preparing for future changes. "for such and such reasons" is a good place to justify why we avoid unnecessarily complex defensive measure and restrict future changes in the documented way. But in _this_ particular case, the "defensive" measure necessary is merely just to quote the shell variables properly, which nobody sensible would say too onerous. I couldn't come up with anything remotely plausible to fill "for such and such reasons" myself when I tried to justify leaving the variables unquoted. Regardless of the quoting issue, we probably want to comment on what value exactly is in PATHNAME before the assignment, by the way. E.g. # The PATHNAME variable holds a filename encoded like a # string constant in C language (e.g. "\060" is digit "0") if test_have_prereq FUNNYNAMES then PATHNAME=quoted:colon:\\060zero else PATHNAME=quoted\\060zero fi That would not just protect only one aspect (i.e. we can pass a colon into the resulting filename) this change but the path goes through the c-unquoting rules. Thanks.
Junio C Hamano <gitster@pobox.com> writes: > But in _this_ particular case, the "defensive" measure necessary is > merely just to quote the shell variables properly, which nobody > sensible would say too onerous. I couldn't come up with anything > remotely plausible to fill "for such and such reasons" myself when I > tried to justify leaving the variables unquoted. > > Regardless of the quoting issue, we probably want to comment on what > value exactly is in PATHNAME before the assignment, by the way. > > E.g. > > # The PATHNAME variable holds a filename encoded like a > # string constant in C language (e.g. "\060" is digit "0") > if test_have_prereq FUNNYNAMES > then > PATHNAME=quoted:colon:\\060zero > else > PATHNAME=quoted\\060zero > fi > > That would not just protect only one aspect (i.e. we can pass a > colon into the resulting filename) this change but the path goes > through the c-unquoting rules. Actually, I _think_ that pushes us beyond the "reasonably defensive for the current need". We'd need to prepare how the pathname is expected to be unquoted for the later test test_path_is_file "$PATHNAME" to work. So here is what I queued as a fixup for this step on top of the series. t/t5003-archive-zip.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git c/t/t5003-archive-zip.sh w/t/t5003-archive-zip.sh index 3a5a052e8c..6addb6c684 100755 --- c/t/t5003-archive-zip.sh +++ w/t/t5003-archive-zip.sh @@ -209,19 +209,19 @@ check_added with_untracked untracked untracked test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' if test_have_prereq FUNNYNAMES then - PATHNAME=quoted:colon + PATHNAME="pathname with : colon" else - PATHNAME=quoted + PATHNAME="pathname without colon" fi && git archive --format=zip >with_file_with_content.zip \ - --add-virtual-file=\"$PATHNAME\": \ + --add-virtual-file=\""$PATHNAME"\": \ --add-virtual-file=hello:world $EMPTY_TREE && test_when_finished "rm -rf tmp-unpack" && mkdir tmp-unpack && ( cd tmp-unpack && "$GIT_UNZIP" ../with_file_with_content.zip && test_path_is_file hello && - test_path_is_file $PATHNAME && + test_path_is_file "$PATHNAME" && test world = $(cat hello) ) '
Junio C Hamano <gitster@pobox.com> writes: >> # The PATHNAME variable holds a filename encoded like a >> # string constant in C language (e.g. "\060" is digit "0") >> if test_have_prereq FUNNYNAMES >> then >> PATHNAME=quoted:colon:\\060zero >> ... > Actually, I _think_ that pushes us beyond the "reasonably defensive > for the current need". We'd need to prepare how the pathname is > expected to be unquoted for the later test > > test_path_is_file "$PATHNAME" > > to work. IOW, I would need to add a new test-tool (attached) and then start this test like so: if ... then PATHNAME=quoted:colon:\\060zero else PATHNAME=quoted\\060zero fi UQPATHNAME=$(test-tool unquote-c-style \""$PATHNAME"\") and change the last test to test_path_is_file "$UQPATHNAME" if we really wanted to test that the the PATHNAME is treated as a c-style quoted string. I am on the fence. We do not have an immediate need, in the sense that nobody needs to encode "0" as "\060" and trigger the unquote codepath in real life. But it does feel prudent to make sure we can grok C-quoted pathname as we claim in the documentation. And the resulting change to the test does not look _too_ bad (and the new test-tool certainly does not hurt, either). So... Makefile | 1 + t/helper/test-quoted.c | 34 ++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 2 ++ t/helper/test-tool.h | 2 ++ 4 files changed, 39 insertions(+) diff --git c/Makefile w/Makefile index 298becd5a5..1d544ad46a 100644 --- c/Makefile +++ w/Makefile @@ -749,6 +749,7 @@ TEST_BUILTINS_OBJS += test-pkt-line.o TEST_BUILTINS_OBJS += test-prio-queue.o TEST_BUILTINS_OBJS += test-proc-receive.o TEST_BUILTINS_OBJS += test-progress.o +TEST_BUILTINS_OBJS += test-quoted.o TEST_BUILTINS_OBJS += test-reach.o TEST_BUILTINS_OBJS += test-read-cache.o TEST_BUILTINS_OBJS += test-read-graph.o diff --git c/t/helper/test-quoted.c w/t/helper/test-quoted.c new file mode 100644 index 0000000000..15baa55e43 --- /dev/null +++ w/t/helper/test-quoted.c @@ -0,0 +1,34 @@ +#include "test-tool.h" +#include "cache.h" +#include "quote.h" + +int cmd__unquote_c_style(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + + while (*++argv) { + const char *p = *argv; + + if (unquote_c_style(&buf, p, &p) < 0) + error("cannot unquote '%s'", *argv); + else + printf("%s\n", buf.buf); + strbuf_reset(&buf); + } + return 0; +} + +int cmd__quote_c_style(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + + while (*++argv) { + const char *p = *argv; + + quote_c_style(p, &buf, NULL, 0); + printf("%s\n", buf.buf); + strbuf_reset(&buf); + } + return 0; +} + diff --git c/t/helper/test-tool.c w/t/helper/test-tool.c index d2eacd302d..5633c98569 100644 --- c/t/helper/test-tool.c +++ w/t/helper/test-tool.c @@ -58,6 +58,7 @@ static struct test_cmd cmds[] = { { "prio-queue", cmd__prio_queue }, { "proc-receive", cmd__proc_receive }, { "progress", cmd__progress }, + { "quote-c-style", cmd__quote_c_style }, { "reach", cmd__reach }, { "read-cache", cmd__read_cache }, { "read-graph", cmd__read_graph }, @@ -81,6 +82,7 @@ static struct test_cmd cmds[] = { { "submodule-nested-repo-config", cmd__submodule_nested_repo_config }, { "subprocess", cmd__subprocess }, { "trace2", cmd__trace2 }, + { "unquote-c-style", cmd__unquote_c_style }, { "userdiff", cmd__userdiff }, { "urlmatch-normalization", cmd__urlmatch_normalization }, { "xml-encode", cmd__xml_encode }, diff --git c/t/helper/test-tool.h w/t/helper/test-tool.h index 960cc27ef7..f5e8929009 100644 --- c/t/helper/test-tool.h +++ w/t/helper/test-tool.h @@ -48,6 +48,7 @@ int cmd__pkt_line(int argc, const char **argv); int cmd__prio_queue(int argc, const char **argv); int cmd__proc_receive(int argc, const char **argv); int cmd__progress(int argc, const char **argv); +int cmd__quote_c_style(int argc, const char **argv); int cmd__reach(int argc, const char **argv); int cmd__read_cache(int argc, const char **argv); int cmd__read_graph(int argc, const char **argv); @@ -71,6 +72,7 @@ int cmd__submodule_config(int argc, const char **argv); int cmd__submodule_nested_repo_config(int argc, const char **argv); int cmd__subprocess(int argc, const char **argv); int cmd__trace2(int argc, const char **argv); +int cmd__unquote_c_style(int argc, const char **argv); int cmd__userdiff(int argc, const char **argv); int cmd__urlmatch_normalization(int argc, const char **argv); int cmd__xml_encode(int argc, const char **argv);
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt index 893cb1075bf..54de945a84e 100644 --- a/Documentation/git-archive.txt +++ b/Documentation/git-archive.txt @@ -67,10 +67,16 @@ OPTIONS by concatenating the value for `--prefix` (if any) and the basename of <file>. + -The `<path>` cannot contain any colon, the file mode is limited to -a regular file, and the option may be subject to platform-dependent -command-line limits. For non-trivial cases, write an untracked file -and use `--add-file` instead. +The `<path>` argument can start and end with a literal double-quote +character; The contained file name is interpreted as a C-style string, +i.e. the backslash is interpreted as escape character. The path must +be quoted if it contains a colon, to avoid the colon from being +misinterpreted as the separator between the path and the contents, or +if the path begins or ends with a double-quote character. ++ +The file mode is limited to a regular file, and the option may be +subject to platform-dependent command-line limits. For non-trivial +cases, write an untracked file and use `--add-file` instead. --worktree-attributes:: Look for attributes in .gitattributes files in the working tree diff --git a/archive.c b/archive.c index d20e16fa819..b7756b91200 100644 --- a/archive.c +++ b/archive.c @@ -9,6 +9,7 @@ #include "parse-options.h" #include "unpack-trees.h" #include "dir.h" +#include "quote.h" static char const * const archive_usage[] = { N_("git archive [<options>] <tree-ish> [<path>...]"), @@ -533,22 +534,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset) die(_("Not a regular file: %s"), path); info->content = NULL; /* read the file later */ } else if (!strcmp(opt->long_name, "add-virtual-file")) { - const char *colon = strchr(arg, ':'); - char *p; + struct strbuf buf = STRBUF_INIT; + const char *p = arg; + + if (*p != '"') + p = strchr(p, ':'); + else if (unquote_c_style(&buf, p, &p) < 0) + die(_("unclosed quote: '%s'"), arg); - if (!colon) + if (!p || *p != ':') die(_("missing colon: '%s'"), arg); - p = xstrndup(arg, colon - arg); - if (!args->prefix) - path = p; - else { - path = prefix_filename(args->prefix, p); - free(p); + if (p == arg) + die(_("empty file name: '%s'"), arg); + + path = buf.len ? + strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg); + + if (args->prefix) { + char *save = path; + path = prefix_filename(args->prefix, path); + free(save); } memset(&info->stat, 0, sizeof(info->stat)); info->stat.st_mode = S_IFREG | 0644; - info->content = xstrdup(colon + 1); + info->content = xstrdup(p + 1); info->stat.st_size = strlen(info->content); } else { BUG("add_file_cb() called for %s", opt->long_name); diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index ebc26e89a9b..3a5a052e8ce 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -207,13 +207,21 @@ check_zip with_untracked check_added with_untracked untracked untracked test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' + if test_have_prereq FUNNYNAMES + then + PATHNAME=quoted:colon + else + PATHNAME=quoted + fi && git archive --format=zip >with_file_with_content.zip \ + --add-virtual-file=\"$PATHNAME\": \ --add-virtual-file=hello:world $EMPTY_TREE && test_when_finished "rm -rf tmp-unpack" && mkdir tmp-unpack && ( cd tmp-unpack && "$GIT_UNZIP" ../with_file_with_content.zip && test_path_is_file hello && + test_path_is_file $PATHNAME && test world = $(cat hello) ) '