Message ID | e50519c3-4055-ab2a-b9f3-0776369e4106@inlv.org (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Herbert Xu |
Headers | show |
On 27/02/2017 07:22, Martijn Dekker wrote: > The xtrace (set -x) output in dash is a bit of a pain, because arguments > containing whitespace aren't quoted. This can it impossible to tell > which bit belongs to which argument: Agreed. > $ dash -c 'set -x; printf "%s\n" one "two three" four' > + printf %s\n one two three four > one > two three > four > > Another disadvantage is you cannot simply copy and paste the commands > from this xtrace output to repeat them for debugging purposes. > > I wrote the attached patch which fixes this. I've been testing it for > about a week and had no issues. > > The patch changes the following: > > - Since we don't want every command name and argument quoted but only > those containing non-shell-safe characters, single_quote() has acquired > an extra argument indicating whether quoting should be conditional upon > the string containing non-shell-safe characters (1) or unconditional > (0). Shell-safe characters are defined as this set: > 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz%+,./:=@_^!- > > - Quoting of variable assignments preceding commands needs to be handled > specially; in order for the output to be suitable for re-entry into the > shell, only the part after the "=" must be quoted. For this purpose, I > changed eprintlist() into two functions, eprintvarlist() to print the > variable assignments and eprintarglist() to print the rest. The general approach looks nice. I think this misses a few cases though if you're aiming to make it suitable for re-entry: $ src/dash -c 'set -x; \!' + ! src/dash: 1: !: not found $ src/dash -c 'set -x; \a=b' + a=b src/dash: 1: a=b: not found $ src/dash -c 'set -x; \if' + if src/dash: 1: if: not found I could create binaries by those names to get rid of the errors. Regardless, in none these cases would the original command be re-executed if the set -x output were entered into the shell. Aside from the first, bash doesn't quote them either. It may be okay to leave this as it is and acknowledge that it doesn't handle all cases, as long as it handles enough of them to be an improvement. Depending on what happens with <http://austingroupbugs.net/view.php?id=249>, if support for $'...' will become required in a future version of POSIX, it may additionally make sense to use that syntax for control characters already right now. It feels like the set of shell-safe characters should be able to use the generated syntax.[ch] files, but none of the existing categories quite matches. If the modification to single_quote can handle all cases (perhaps quoting unnecessarily if it's unclear), then the conditional parameter could be removed and applied unconditionally: as far as I can tell, there are no existing calls that require the single quotes for simple words. > Hope this is useful, > > - Martijn It looks useful to me. :) Cheers, Harald van Dijk -- To unsubscribe from this list: send the line "unsubscribe dash" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Op 27-02-17 om 20:24 schreef Harald van Dijk: > The general approach looks nice. I think this misses a few cases though > if you're aiming to make it suitable for re-entry: > > $ src/dash -c 'set -x; \!' > + ! > src/dash: 1: !: not found > > $ src/dash -c 'set -x; \a=b' > + a=b > src/dash: 1: a=b: not found > > $ src/dash -c 'set -x; \if' > + if > src/dash: 1: if: not found > > I could create binaries by those names to get rid of the errors. > Regardless, in none these cases would the original command be > re-executed if the set -x output were entered into the shell. Good point. The first and second cases would be fixed by simply removing "!" and "=" from SHELLSAFECHARS. Not a big loss. Shell reserved words (a.k.a. shell keywords) such as "if" are harder to fix. The single_quote() routine, if told to quote conditionally, would have to iterate through the internal list of shell reserved words and quote the string if it matches one of those. I will see if I can make that happen. Actually, after testing bash, yash, zsh, ksh93 and mksh as well as dash, it looks like no current shell manages to quote strings that would otherwise be shell keywords in xtrace output, with the exception of the "!" keyword. So fixing this would make dash's xtrace output better than any other shell's. :) Since shell reserved words are part of the grammar, they never show up in xtrace output, so at least the output is unambiguous: you know anything that looks like a reserved word isn't. > It may be okay to leave this as it is and acknowledge that it doesn't > handle all cases, as long as it handles enough of them to be an > improvement. Yes. I would still remove "!" and "=" from SHELLSAFECHARS though; at least that way you won't get output that looks like a negation or assignment but isn't. > If the modification to single_quote can handle all cases (perhaps > quoting unnecessarily if it's unclear), then the conditional > parameter could be removed and applied unconditionally: as far as I > can tell, there are no existing calls that require the single quotes > for simple words. True, but this would change the output of something like: $ dash -c 'alias stuff=xyz; trap true EXIT; alias; trap' from: stuff='xyz' trap -- 'true' EXIT to: stuff=xyz trap -- true EXIT While there is technically nothing wrong with this, I think it's nicer to leave those quoted unconditionally, because that facilitates editing the output of those commands to feed it back to the shell. - M. -- To unsubscribe from this list: send the line "unsubscribe dash" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 27/02/2017 21:08, Martijn Dekker wrote: > Op 27-02-17 om 20:24 schreef Harald van Dijk: >> If the modification to single_quote can handle all cases (perhaps >> quoting unnecessarily if it's unclear), then the conditional >> parameter could be removed and applied unconditionally: as far as I >> can tell, there are no existing calls that require the single quotes >> for simple words. > > True, but this would change the output of something like: > > $ dash -c 'alias stuff=xyz; trap true EXIT; alias; trap' > > from: > > stuff='xyz' > trap -- 'true' EXIT > > to: > > stuff=xyz > trap -- true EXIT > > While there is technically nothing wrong with this, I think it's nicer > to leave those quoted unconditionally, because that facilitates editing > the output of those commands to feed it back to the shell. Fair point. Looking at other shells: | bash | ksh93 | zsh | posh | dash | dash (patched) alias | 2 | 1 | 1 | - | 2 | 2 trap | 2 | 1 | 1 | 1 | 2 | 2 export -p | 2 | 1 | 1 | 0 | 2 | 2 readonly -p | 2 | 1 | 1 | 0 | 2 | 2 set | 1 | 1 | 1 | - | 2 | 2 set -x | 1 | 1 | 1 | 0 | 0 | 1 Here, 0 means "never quoted", 1 means "quoted when containing special characters", and 2 means "always quoted". Although there is a bit of variation in other shells, your approach brings dash closer to bash, which I believe is considered a goal for dash, all else being equal. My suggestion would bring dash closer to bash in one command, but further from bash in four others, so unless there is a good reason for it, I think I agree that that shouldn't be done. Cheers, Harald van Dijk -- To unsubscribe from this list: send the line "unsubscribe dash" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff -ur dash-0.5.9.1.orig/src/alias.c dash-0.5.9.1/src/alias.c --- dash-0.5.9.1.orig/src/alias.c 2014-09-28 10:19:32.000000000 +0200 +++ dash-0.5.9.1/src/alias.c 2017-02-23 03:01:57.000000000 +0100 @@ -197,7 +197,7 @@ void printalias(const struct alias *ap) { - out1fmt("%s=%s\n", ap->name, single_quote(ap->val)); + out1fmt("%s=%s\n", ap->name, single_quote(ap->val, 0)); } STATIC struct alias ** diff -ur dash-0.5.9.1.orig/src/eval.c dash-0.5.9.1/src/eval.c --- dash-0.5.9.1.orig/src/eval.c 2016-09-02 16:12:23.000000000 +0200 +++ dash-0.5.9.1/src/eval.c 2017-02-23 03:14:56.000000000 +0100 @@ -95,7 +95,8 @@ STATIC int evalbltin(const struct builtincmd *, int, char **, int); STATIC int evalfun(struct funcnode *, int, char **, int); STATIC void prehash(union node *); -STATIC int eprintlist(struct output *, struct strlist *, int); +STATIC int eprintvarlist(struct output *, struct strlist *, int); +STATIC void eprintarglist(struct output *, struct strlist *, int); STATIC int bltincmd(int, char **); @@ -786,8 +787,8 @@ out = &preverrout; outstr(expandstr(ps4val()), out); sep = 0; - sep = eprintlist(out, varlist.list, sep); - eprintlist(out, arglist.list, sep); + sep = eprintvarlist(out, varlist.list, sep); + eprintarglist(out, arglist.list, sep); outcslow('\n', out); #ifdef FLUSHERR flushout(out); @@ -1107,16 +1108,35 @@ STATIC int -eprintlist(struct output *out, struct strlist *sp, int sep) +eprintvarlist(struct output *out, struct strlist *sp, int sep) { while (sp) { const char *p; + int i; - p = " %s" + (1 - sep); + if (sep) + outfmt(out, " "); sep |= 1; - outfmt(out, p, sp->text); + i = 0; + while (sp->text[i] != '=' && sp->text[i] != '\0') + outfmt(out, "%c", sp->text[i++]); + if (sp->text[i] == '=') + outfmt(out, "=%s", single_quote(sp->text+i+1, 1)); sp = sp->next; } return sep; } + +STATIC void +eprintarglist(struct output *out, struct strlist *sp, int sep) +{ + while (sp) { + const char *p; + + p = " %s" + (1 - sep); + sep |= 1; + outfmt(out, p, single_quote(sp->text, 1)); + sp = sp->next; + } +} diff -ur dash-0.5.9.1.orig/src/mystring.c dash-0.5.9.1/src/mystring.c --- dash-0.5.9.1.orig/src/mystring.c 2014-09-28 10:19:32.000000000 +0200 +++ dash-0.5.9.1/src/mystring.c 2017-02-23 03:10:48.000000000 +0100 @@ -184,13 +184,19 @@ /* * Produce a possibly single quoted string suitable as input to the shell. - * The return string is allocated on the stack. + * If 'conditional' is nonzero, quoting is only done if the string contains + * non-shellsafe characters; if it is zero, quoting is always done. + * If quoting was done, the return string is allocated on the stack, + * otherwise a pointer to the original string is returned. */ char * -single_quote(const char *s) { +single_quote(const char *s, int conditional) { char *p; + if (conditional && s[strspn(s, SHELLSAFECHARS)] == '\0') + return (char *)s; + STARTSTACKSTR(p); do { diff -ur dash-0.5.9.1.orig/src/mystring.h dash-0.5.9.1/src/mystring.h --- dash-0.5.9.1.orig/src/mystring.h 2014-09-28 10:19:32.000000000 +0200 +++ dash-0.5.9.1/src/mystring.h 2017-02-23 03:01:57.000000000 +0100 @@ -54,10 +54,13 @@ intmax_t atomax10(const char *); int number(const char *); int is_number(const char *); -char *single_quote(const char *); +char *single_quote(const char *, int); char *sstrdup(const char *); int pstrcmp(const void *, const void *); const char *const *findstring(const char *, const char *const *, size_t); #define equal(s1, s2) (strcmp(s1, s2) == 0) #define scopy(s1, s2) ((void)strcpy(s2, s1)) + +/* Characters that don't need quoting before re-entry into the shell */ +#define SHELLSAFECHARS "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz%+,./:=@_^!-" diff -ur dash-0.5.9.1.orig/src/trap.c dash-0.5.9.1/src/trap.c --- dash-0.5.9.1.orig/src/trap.c 2016-09-02 16:12:23.000000000 +0200 +++ dash-0.5.9.1/src/trap.c 2017-02-23 03:01:57.000000000 +0100 @@ -107,7 +107,7 @@ if (trap[signo] != NULL) { out1fmt( "trap -- %s %s\n", - single_quote(trap[signo]), + single_quote(trap[signo], 0), signal_names[signo] ); } diff -ur dash-0.5.9.1.orig/src/var.c dash-0.5.9.1/src/var.c --- dash-0.5.9.1.orig/src/var.c 2014-10-07 16:30:35.000000000 +0200 +++ dash-0.5.9.1/src/var.c 2017-02-23 03:01:57.000000000 +0100 @@ -417,7 +417,7 @@ p = strchrnul(*ep, '='); q = nullstr; if (*p) - q = single_quote(++p); + q = single_quote(++p, 0); out1fmt("%s%s%.*s%s\n", prefix, sep, (int)(p - *ep), *ep, q); }