From patchwork Sun Mar 4 21:29:25 2018
X-Patchwork-Submitter: Harald van Dijk
X-Patchwork-Id: 10257837
X-Patchwork-Delegate: herbert@gondor.apana.org.au
Subject: Re: dash bug: double-quoted "\" breaks glob protection for next char
To: Martijn Dekker, Herbert Xu
Cc: Denys Vlasenko, dash@vger.kernel.org
From: Harald van Dijk
Message-ID: <041881f9-9084-4083-345a-8f85792b48ef@gigawatt.nl>
Date: Sun, 4 Mar 2018 22:29:25 +0100
In-Reply-To: <7dac7df9-4093-095e-dd71-2d7383edd8c3@inlv.org>
X-Mailing-List: dash@vger.kernel.org

On 3/4/18 9:08 PM, Martijn Dekker wrote:
> On 04-03-18 at 16:46, Harald van Dijk wrote:
>> FreeBSD sh also prints a blank line here.
> [...]
>> Like above, FreeBSD sh behaves like ksh.
>
> I stand corrected.
>
> Is there any port of FreeBSD sh to other operating systems? It would be
> much more convenient for me to include it in my tests if I didn't have
> to launch a FreeBSD VM and rsync & run the test scripts separately.

None that I know of. Running the test script over ssh might be slightly
less difficult, but nothing as easy as a port. The source code contains
several very much FreeBSD-specific bits.

>> Yes, the inconsistency should be fixed.
>> Either it should be treated as quoted or as unquoted, but not
>> quoted-unless-it-comes-from-a-variable. I have no strong feelings on
>> which it should be.
>
> Neither do I, so I would default to the behaviour that both pre-exists
> in dash and corresponds with the majority of other shells.

I went for the behaviour that required the fewest changes for now, which
is to treat them as unquoted. If it is agreed that it should be quoted,
it requires some additional (minor) complications in the parser, because
the existing state would no longer be sufficient to determine whether }
should end the substitution. But yes, I agree that given how long dash
has treated this as quoted, it makes sense to keep that, unless there's
a compelling reason not to.

> [...]
>>> $ src/dash -c 'printf "%s\n" "${$+\}}"'
>>> \}
>>>
>>> Expected output: }  (no backslash), as in bash 4, yash, ksh93, pdksh,
>>> mksh, zsh. In other words: it should be possible to escape a '}' with a
>>> backslash within the parameter expansion, even if the expansion is
>>> quoted.
>>>
>>> POSIX ref.: 2.6.2 Parameter Expansion
>>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
>>>
>>> | Any '}' escaped by a <backslash> or within a quoted string, and
>>> | characters in embedded arithmetic expansions, command substitutions,
>>> | and variable expansions, shall not be examined in determining the
>>> | matching '}'.
>>
>> I believe this actually requires dash's behaviour. This says the first }
>> isn't examined in determining the matching '}', but only that: it just
>> says the parameter expansion expression is $+\}. It doesn't say the
>> backslash is removed.
>
> I believe the word "escaped" implies that removal. If a '}' is escaped
> by a backslash, it's implied that the backslash is removed as this
> escaping is parsed, just as it's implied that quotes are removed from a
> quoted string.

That's not implied, that's stated:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_07
> The quote characters ( <backslash>, single-quote, and double-quote)
> that were present in the original word shall be removed unless they
> have themselves been quoted.

In this case, the backslash was quoted, so this doesn't apply.

>> I agree that it would be much better to print } here though.
>
> All other current shells except bosh (schilytools sh) agree, too -- even
> FreeBSD sh, and I checked it this time.

Shells agree on the simple cases to remove the backslash:

${x+\}} "${x+\}}" < startloc) {
+		if ((flag & (EXP_WORD | EXP_QUOTED)) == EXP_WORD && newloc > startloc) {
 			recordregion(startloc, newloc, 0);
 		}
 		startloc = newloc;
@@ -316,15 +318,9 @@ start:
 		case CTLENDVAR: /* ??? */
 			goto breakloop;
 		case CTLQUOTEMARK:
-			inquotes ^= EXP_QUOTED;
-			/* "$@" syntax adherence hack */
-			if (inquotes && !memcmp(p, dolatstr + 1,
-						DOLATSTRLEN - 1)) {
-				p = evalvar(p + 1, flag | inquotes) + 1;
-				goto start;
-			}
+			flag ^= EXP_QUOTED;
 addquote:
-			if (flag & QUOTES_ESC) {
+			if (quotes) {
 				p--;
 				length++;
 				startloc++;
@@ -333,27 +329,26 @@ addquote:
 		case CTLESC:
 			startloc++;
 			length++;
-
-			/*
-			 * Quoted parameter expansion pattern: remove quote
-			 * unless inside inner quotes or we have a literal
-			 * backslash.
-			 */
-			if (((flag | inquotes) & (EXP_QPAT | EXP_QUOTED)) ==
-			    EXP_QPAT && *p != '\\')
-				break;
-
 			goto addquote;
 		case CTLVAR:
-			p = evalvar(p, flag | inquotes);
+			/* "$@" syntax adherence hack */
+			dolatstrhack = !memcmp(p, dolatstr+1, DOLATSTRLEN-1) && !shellparam.nparam && quotes;
+			p = evalvar(p, flag);
+			if (dolatstrhack && prev == (char)CTLQUOTEMARK && *p == (char)CTLQUOTEMARK) {
+				expdest--;
+				flag ^= EXP_QUOTED;
+				p++;
+			}
 			goto start;
 		case CTLBACKQ:
-			expbackq(argbackq->n, flag | inquotes);
+			c = 0;
+		case CTLBACKQ|CTLQUOTE:
+			expbackq(argbackq->n, c, quotes);
 			argbackq = argbackq->next;
 			goto start;
 		case CTLENDARI:
 			p--;
-			expari(flag | inquotes);
+			expari(quotes);
 			goto start;
 		}
 	}
@@ -449,11 +444,12 @@ removerecordregions(int endoff)
  * evaluate, place result in (backed up) result, adjust string position.
  */
 void
-expari(int flag)
+expari(int quotes)
 {
 	struct stackmark sm;
 	char *p, *start;
 	int begoff;
+	char flag;
 	int len;
 	intmax_t result;
 
@@ -468,42 +464,24 @@ expari(int flag)
 	p = expdest;
 	pushstackmark(&sm, p - start);
 	*--p = '\0';
-	p--;
-	do {
-		int esc;
-
-		while (*p != (char)CTLARI) {
-			p--;
-#ifdef DEBUG
-			if (p < start) {
-				sh_error("missing CTLARI (shouldn't happen)");
-			}
-#endif
-		}
-
-		esc = esclen(start, p);
-		if (!(esc % 2)) {
-			break;
-		}
-
-		p -= esc + 1;
-	} while (1);
-
+	p = (char *)findstartchar(start, p, CTLARI, CTLENDARI);
 	begoff = p - start;
 
 	removerecordregions(begoff);
 
+	flag = p[1] & VSSYNTAX;
+
 	expdest = p;
 
-	if (likely(flag & QUOTES_ESC))
-		rmescapes(p + 1);
+	if (likely(quotes))
+		rmescapes(p + 2);
 
-	result = arith(p + 1);
+	result = arith(p + 2);
 	popstackmark(&sm);
 
 	len = cvtnum(result);
 
-	if (likely(!(flag & EXP_QUOTED)))
+	if (likely(!flag))
 		recordregion(begoff, begoff + len, 0);
 }
 
@@ -513,7 +491,7 @@ expari(int flag)
  */
 
 STATIC void
-expbackq(union node *cmd, int flag)
+expbackq(union node *cmd, int quoted, int quotes)
 {
 	struct backcmd in;
 	int i;
@@ -521,7 +499,7 @@ expbackq(union node *cmd, int flag)
 	char *p;
 	char *dest;
 	int startloc;
-	char const *syntax = flag & EXP_QUOTED ? DQSYNTAX : BASESYNTAX;
+	char const *syntax = quoted ? DQSYNTAX : BASESYNTAX;
 	struct stackmark smark;
 
 	INTOFF;
@@ -535,7 +513,7 @@ expbackq(union node *cmd, int flag)
 	if (i == 0)
 		goto read;
 	for (;;) {
-		memtodest(p, i, syntax, flag & QUOTES_ESC);
+		memtodest(p, i, syntax, quotes);
 read:
 		if (in.fd < 0)
 			break;
@@ -562,7 +540,7 @@ read:
 	STUNPUTC(dest);
 	expdest = dest;
 
-	if (!(flag & EXP_QUOTED))
+	if (!quoted)
 		recordregion(startloc, dest - (char *)stackblock(), 0);
 	TRACE(("evalbackq: size=%d: \"%.*s\"\n",
 		(dest - (char *)stackblock()) - startloc,
@@ -639,9 +617,8 @@ scanright(
 }
 
 STATIC const char *
-subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags, int flag)
+subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags, int quotes)
 {
-	int quotes = flag & QUOTES_ESC;
 	char *startp;
 	char *loc;
 	struct nodelist *saveargbackq = argbackq;
@@ -651,8 +628,7 @@ subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
 	char *(*scan)(char *, char *, char *, char *, int , int);
 
 	argstr(p, EXP_TILDE | (subtype != VSASSIGN && subtype != VSQUESTION ?
-			       (flag & (EXP_QUOTED | EXP_QPAT) ?
-			        EXP_QPAT : EXP_CASE) : 0));
+			       EXP_CASE : 0));
 	STPUTC('\0', expdest);
 	argbackq = saveargbackq;
 	startp = stackblock() + startloc;
@@ -722,22 +698,25 @@ evalvar(char *p, int flag)
 	int startloc;
 	ssize_t varlen;
 	int easy;
+	int quotes;
 	int quoted;
 
+	quotes = flag & QUOTES_ESC;
 	varflags = *p++;
 	subtype = varflags & VSTYPE;
 
 	if (!subtype)
 		sh_error("Bad substitution");
 
-	quoted = flag & EXP_QUOTED;
+	quoted = varflags & VSQUOTE;
 	var = p;
 	easy = (!quoted || (*var == '@' && shellparam.nparam));
+
 	startloc = expdest - (char *)stackblock();
 	p = strchr(p, '=') + 1;
 
 again:
-	varlen = varvalue(var, varflags, flag, &quoted);
+	varlen = varvalue(var, varflags, flag, quoted);
 	if (varflags & VSNUL)
 		varlen--;
 
@@ -749,7 +728,8 @@ again:
 	if (subtype == VSMINUS) {
 vsplus:
 		if (varlen < 0) {
-			argstr(p, flag | EXP_TILDE | EXP_WORD);
+			argstr(p, flag | EXP_TILDE | EXP_WORD |
+				(quoted ? EXP_QUOTED : 0));
 			goto end;
 		}
 		goto record;
@@ -759,8 +739,7 @@ vsplus:
 		if (varlen >= 0)
 			goto record;
 
-		subevalvar(p, var, 0, subtype, startloc, varflags,
-			   flag & ~QUOTES_ESC);
+		subevalvar(p, var, 0, subtype, startloc, varflags, 0);
 		varflags &= ~VSNUL;
 		/*
 		 * Remove any recorded regions beyond
@@ -806,7 +785,7 @@ record:
 		STPUTC('\0', expdest);
 		patloc = expdest - (char *)stackblock();
 		if (subevalvar(p, NULL, patloc, subtype,
-			       startloc, varflags, flag) == 0) {
+			       startloc, varflags, quotes) == 0) {
 			int amount = expdest - (
 				(char *)stackblock() + patloc - 1
 			);
@@ -823,7 +802,7 @@ end:
 		for (;;) {
 			if ((c = (signed char)*p++) == CTLESC)
 				p++;
-			else if (c == CTLBACKQ) {
+			else if (c == CTLBACKQ || c == (CTLBACKQ|CTLQUOTE)) {
 				if (varlen >= 0)
 					argbackq = argbackq->next;
 			} else if (c == CTLVAR) {
@@ -887,7 +866,7 @@ strtodest(p, syntax, quotes)
  */
 
 STATIC ssize_t
-varvalue(char *name, int varflags, int flags, int *quotedp)
+varvalue(char *name, int varflags, int flags, int quoted)
 {
 	int num;
 	char *p;
@@ -896,7 +875,6 @@ varvalue(char *name, int varflags, int flags, int *quotedp)
 	char sepc;
 	char **ap;
 	char const *syntax;
-	int quoted = *quotedp;
 	int subtype = varflags & VSTYPE;
 	int discard = subtype == VSPLUS || subtype == VSLENGTH;
 	int quotes = (discard ? 0 : (flags & QUOTES_ESC)) | QUOTES_KEEPNUL;
@@ -942,7 +920,6 @@ numvar:
 		sep |= ifsset() ? ifsval()[0] : ' ';
 param:
 		sepc = sep;
-		*quotedp = !sepc;
 		if (!(ap = shellparam.p))
 			return -1;
 		while ((p = *ap++)) {
@@ -1644,7 +1621,6 @@ char *
 _rmescapes(char *str, int flag)
 {
 	char *p, *q, *r;
-	unsigned inquotes;
 	int notescaped;
 	int globbing;
 
@@ -1674,24 +1650,23 @@ _rmescapes(char *str, int flag)
 			q = mempcpy(q, str, len);
 		}
 	}
-	inquotes = 0;
 	globbing = flag & RMESCAPE_GLOB;
 	notescaped = globbing;
 	while (*p) {
 		if (*p == (char)CTLQUOTEMARK) {
-			inquotes = ~inquotes;
 			p++;
 			notescaped = globbing;
 			continue;
 		}
+		if (*p == '\\') {
+			/* naked back slash */
+			notescaped = 0;
+			goto copy;
+		}
 		if (*p == (char)CTLESC) {
 			p++;
 			if (notescaped)
 				*q++ = '\\';
-		} else if (*p == '\\' && !inquotes) {
-			/* naked back slash */
-			notescaped = 0;
-			goto copy;
 		}
 		notescaped = globbing;
 copy:
diff --git a/src/expand.h b/src/expand.h
index 26dc5b4..90f5328 100644
--- a/src/expand.h
+++ b/src/expand.h
@@ -55,7 +55,6 @@ struct arglist {
 #define EXP_VARTILDE 0x4 /* expand tildes in an assignment */
 #define EXP_REDIR 0x8 /* file glob for a redirection (1 match only) */
 #define EXP_CASE 0x10 /* keeps quotes around for CASE pattern */
-#define EXP_QPAT 0x20 /* pattern in quoted parameter expansion */
 #define EXP_VARTILDE2 0x40 /* expand tildes after colons only */
 #define EXP_WORD 0x80 /* expand word in parameter expansion */
 #define EXP_QUOTED 0x100 /* expand word in double quotes */
diff --git a/src/jobs.c b/src/jobs.c
index 4f02e38..6ba6b48 100644
--- a/src/jobs.c
+++ b/src/jobs.c
@@ -1375,7 +1375,6 @@ cmdputs(const char *s)
 	char *nextc;
 	signed char c;
 	int subtype = 0;
-	int quoted = 0;
 	static const char vstype[VSTYPE + 1][4] = {
 		"", "}", "-", "+", "?", "=",
 		"%", "%%", "#", "##",
@@ -1397,11 +1396,11 @@ cmdputs(const char *s)
 			str = "${";
 			goto dostr;
 		case CTLENDVAR:
-			str = "\"}" + !(quoted & 1);
-			quoted >>= 1;
+			str = "}";
 			subtype = 0;
 			goto dostr;
 		case CTLBACKQ:
+		case CTLBACKQ|CTLQUOTE:
 			str = "$(...)";
 			goto dostr;
 		case CTLARI:
@@ -1411,14 +1410,11 @@ cmdputs(const char *s)
 			str = "))";
 			goto dostr;
 		case CTLQUOTEMARK:
-			quoted ^= 1;
 			c = '"';
 			break;
 		case '=':
 			if (subtype == 0)
 				break;
-			if ((subtype & VSTYPE) != VSNORMAL)
-				quoted <<= 1;
 			str = vstype[subtype & VSTYPE];
 			if (subtype & VSNUL)
 				c = ':';
@@ -1446,9 +1442,6 @@ dostr:
 			USTPUTC(c, nextc);
 		}
 	}
-	if (quoted & 1) {
-		USTPUTC('"', nextc);
-	}
 	*nextc = 0;
 	cmdnextc = nextc;
 }
diff --git a/src/mksyntax.c b/src/mksyntax.c
index a23c18c..41c9ceb 100644
--- a/src/mksyntax.c
+++ b/src/mksyntax.c
@@ -145,7 +145,8 @@ main(int argc, char **argv)
 		fprintf(hfile, "/* %s */\n", is_entry[i].comment);
 	}
 	putc('\n', hfile);
-	fprintf(hfile, "#define SYNBASE %d\n", 130);
+	fprintf(hfile, "#define SYNBASE %d\n", 131);
+	fprintf(hfile, "#define PVSSYNTAX %d\n", -131);
 	fprintf(hfile, "#define PEOF %d\n\n", -130);
 	fprintf(hfile, "#define PEOA %d\n\n", -129);
 	putc('\n', hfile);
@@ -158,6 +159,7 @@ main(int argc, char **argv)
 	putc('\n', hfile);
 
 	/* Generate the syntax tables. */
+	fputs("#include \"parser.h\"\n\n", cfile);
 	fputs("#include \"shell.h\"\n", cfile);
 	fputs("#include \"syntax.h\"\n\n", cfile);
 	init();
@@ -170,7 +172,8 @@ main(int argc, char **argv)
 	add("$", "CVAR");
 	add("}", "CENDVAR");
 	add("<>();&| \t", "CSPCL");
-	syntax[1] = "CSPCL";
+	syntax[0] = "0";
+	syntax[2] = "CSPCL";
 	print("basesyntax");
 	init();
 	fputs("\n/* syntax table used when in double quotes */\n", cfile);
@@ -182,6 +185,7 @@ main(int argc, char **argv)
 	add("}", "CENDVAR");
 	/* ':/' for tilde expansion, '-' for [a\-x] pattern ranges */
 	add("!*?[=~:/-]", "CCTL");
+	syntax[0] = "VSQUOTE";
 	print("dqsyntax");
 	init();
 	fputs("\n/* syntax table used when in single quotes */\n", cfile);
@@ -189,6 +193,7 @@ main(int argc, char **argv)
 	add("'", "CENDQUOTE");
 	/* ':/' for tilde expansion, '-' for [a\-x] pattern ranges */
 	add("!*?[=~:/-]\\", "CCTL");
+	syntax[0] = "0";
 	print("sqsyntax");
 	init();
 	fputs("\n/* syntax table used when in arithmetic */\n", cfile);
@@ -199,6 +204,7 @@ main(int argc, char **argv)
 	add("}", "CENDVAR");
 	add("(", "CLP");
 	add(")", "CRP");
+	syntax[0] = "VSARITH";
 	print("arisyntax");
 	filltable("0");
 	fputs("\n/* character classification table */\n", cfile);
@@ -223,7 +229,7 @@ filltable(char *dftval)
 {
 	int i;
 
-	for (i = 0 ; i < 258; i++)
+	for (i = 0 ; i < 259; i++)
 		syntax[i] = dftval;
 }
 
@@ -238,10 +244,10 @@ init(void)
 	int ctl;
 
 	filltable("CWORD");
-	syntax[0] = "CEOF";
-	syntax[1] = "CIGN";
+	syntax[1] = "CEOF";
+	syntax[2] = "CIGN";
 	for (ctl = CTL_FIRST; ctl <= CTL_LAST; ctl++ )
-		syntax[130 + ctl] = "CCTL";
+		syntax[131 + ctl] = "CCTL";
 }
 
 
@@ -253,7 +259,7 @@ static void
 add(char *p, char *type)
 {
 	while (*p)
-		syntax[(signed char)*p++ + 130] = type;
+		syntax[(signed char)*p++ + 131] = type;
 }
 
 
@@ -271,7 +277,7 @@ print(char *name)
 	fprintf(hfile, "extern const char %s[];\n", name);
 	fprintf(cfile, "const char %s[] = {\n", name);
 	col = 0;
-	for (i = 0 ; i < 258; i++) {
+	for (i = 0 ; i < 259; i++) {
 		if (i == 0) {
 			fputs(" ", cfile);
 		} else if ((i & 03) == 0) {
diff --git a/src/mystring.c b/src/mystring.c
index 0106bd2..a0d5e47 100644
--- a/src/mystring.c
+++ b/src/mystring.c
@@ -60,8 +60,7 @@
 char nullstr[1]; /* zero length string */
 const char spcstr[] = " ";
 const char snlfmt[] = "%s\n";
-const char dolatstr[] = { CTLQUOTEMARK, CTLVAR, VSNORMAL, '@', '=',
-			  CTLQUOTEMARK, '\0' };
+const char dolatstr[] = { CTLVAR, VSNORMAL|VSQUOTE, '@', '=', '\0' };
 const char qchars[] = { CTLESC, CTLQUOTEMARK, 0 };
 const char illnum[] = "Illegal number: %s";
 const char homestr[] = "HOME";
diff --git a/src/mystring.h b/src/mystring.h
index 083ea98..3a82f05 100644
--- a/src/mystring.h
+++ b/src/mystring.h
@@ -40,7 +40,7 @@
 extern const char snlfmt[];
 extern const char spcstr[];
 extern const char dolatstr[];
-#define DOLATSTRLEN 6
+#define DOLATSTRLEN 4
 extern const char qchars[];
 extern const char illnum[];
 extern const char homestr[];
diff --git a/src/parser.c b/src/parser.c
index 382658e..f8c95bc 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -876,24 +876,16 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 	size_t len;
 	struct nodelist *bqlist;
 	int quotef;
-	int dblquote;
+	int nhere;
 	int varnest; /* levels of variables expansion */
-	int arinest; /* levels of arithmetic expansion */
 	int parenlevel; /* levels of parens in arithmetic */
-	int dqvarnest; /* levels of variables expansion within double quotes */
 	int oldstyle;
-	/* syntax before arithmetic */
-	char const *uninitialized_var(prevsyntax);
 
-	dblquote = 0;
-	if (syntax == DQSYNTAX)
-		dblquote = 1;
+	nhere = eofmark && syntax == SQSYNTAX;
 	quotef = 0;
 	bqlist = NULL;
 	varnest = 0;
-	arinest = 0;
 	parenlevel = 0;
-	dqvarnest = 0;
 
 	STARTSTACKSTR(out);
 	loop: { /* for each line, until end of word */
@@ -922,7 +914,7 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 				USTPUTC(c, out);
 				break;
 			case CCTL:
-				if (eofmark == NULL || dblquote)
+				if (!nhere)
 					USTPUTC(CTLESC, out);
 				USTPUTC(c, out);
 				break;
@@ -937,13 +929,17 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 					nlprompt();
 				} else {
 					if (
-						dblquote &&
+						syntax != BASESYNTAX &&
 						c != '\\' && c != '`' &&
 						c != '$' && (
 							c != '"' ||
 							eofmark != NULL
+						) && (
+							c != '}' ||
+							!varnest
 						)
 					) {
+						USTPUTC(CTLESC, out);
 						USTPUTC('\\', out);
 					}
 					USTPUTC(CTLESC, out);
@@ -960,16 +956,12 @@ quotemark:
 				break;
 			case CDQUOTE:
 				syntax = DQSYNTAX;
-				dblquote = 1;
 				goto quotemark;
 			case CENDQUOTE:
 				if (eofmark && !varnest)
 					USTPUTC(c, out);
 				else {
-					if (dqvarnest == 0) {
-						syntax = BASESYNTAX;
-						dblquote = 0;
-					}
+					syntax = BASESYNTAX;
 					quotef++;
 					goto quotemark;
 				}
@@ -979,14 +971,18 @@ quotemark:
 				break;
 			case CENDVAR: /* '}' */
 				if (varnest > 0) {
-					varnest--;
-					if (dqvarnest > 0) {
-						dqvarnest--;
+					const char *startchar = findstartchar((char *)stackblock(), out, CTLVAR, CTLENDVAR);
+					char vstype = startchar[1] & VSTYPE;
+					char vssyntax = startchar[1] & VSSYNTAX;
+					const char *prevsyntax = vssyntax == (char)VSARITH ? ARISYNTAX : vssyntax == (char)VSQUOTE ? DQSYNTAX : BASESYNTAX;
+					if (syntax == (prevsyntax == BASESYNTAX || (vstype >= VSTRIM_FIRST && vstype <= VSTRIM_LAST) ? BASESYNTAX : DQSYNTAX)) {
+						syntax = prevsyntax;
+						varnest--;
+						USTPUTC(CTLENDVAR, out);
+						break;
 					}
-					USTPUTC(CTLENDVAR, out);
-				} else {
-					USTPUTC(c, out);
 				}
+				USTPUTC(c, out);
 				break;
 			case CLP: /* '(' in arithmetic */
 				parenlevel++;
@@ -999,8 +995,9 @@ quotemark:
 				} else {
 					if (pgetc() == ')') {
 						USTPUTC(CTLENDARI, out);
-						if (!--arinest)
-							syntax = prevsyntax;
+
+						char type = findstartchar((char *)stackblock(), out - 1, CTLARI, CTLENDARI)[1] & VSSYNTAX;
+						syntax = type == (char)VSARITH ? ARISYNTAX : type == (char)VSQUOTE ? DQSYNTAX : BASESYNTAX;
 					} else {
 						/*
 						 * unbalanced parens
@@ -1289,12 +1286,13 @@ varname:
 badsub:
 			pungetc();
 		}
-		*((char *)stackblock() + typeloc) = subtype;
+		const char *prevsyntax = syntax;
 		if (subtype != VSNORMAL) {
 			varnest++;
-			if (dblquote)
-				dqvarnest++;
+			syntax = syntax == BASESYNTAX || (subtype >= VSTRIM_FIRST && subtype <= VSTRIM_LAST) ? BASESYNTAX : DQSYNTAX;
 		}
+		subtype |= prevsyntax[PVSSYNTAX];
+		*((char *)stackblock() + typeloc) = subtype;
 		STPUTC('=', out);
 	}
 	goto parsesub_return;
@@ -1352,7 +1350,7 @@ parsebackq: {
 					continue;
 				}
 				if (pc != '\\' && pc != '`' && pc != '$'
-				    && (!dblquote || pc != '"'))
+				    && (syntax == BASESYNTAX || pc != '"'))
 					STPUTC('\\', pout);
 				if (pc > PEOA) {
 					break;
@@ -1416,7 +1414,10 @@ done:
 		memcpy(out, str, savelen);
 		STADJUST(savelen, out);
 	}
-	USTPUTC(CTLBACKQ, out);
+	if (syntax != BASESYNTAX)
+		USTPUTC(CTLBACKQ | CTLQUOTE, out);
+	else
+		USTPUTC(CTLBACKQ, out);
 	if (oldstyle)
 		goto parsebackq_oldreturn;
 	else
@@ -1428,11 +1429,9 @@ done:
  */
 parsearith: {
 
-	if (++arinest == 1) {
-		prevsyntax = syntax;
-		syntax = ARISYNTAX;
-	}
 	USTPUTC(CTLARI, out);
+	USTPUTC(VSTYPE | syntax[PVSSYNTAX], out);
+	syntax = ARISYNTAX;
 	goto parsearith_return;
 }
 
@@ -1466,6 +1465,39 @@ endofname(const char *name)
 }
 
 
+const char *
+findstartchar(const char *start, const char *p, char open, char close) {
+	int nest = 1;
+	const char *q;
+	for (;; ) {
+		int d;
+
+		--p;
+
+#if DEBUG
+		if (p < start)
+			sh_error("missing start char (shouldn't happen)");
+#endif
+
+		if (*p == open) {
+			if ((p[1] & VSTYPE) == VSNORMAL)
+				continue;
+
+			d = -1;
+		checkescapes:
+			for (q = p; q != start && q[-1] == (char)CTLESC; q--)
+				;
+
+			if ((p - q) % 2 == 0 && !(nest += d))
+				return p;
+		} else if (*p == close) {
+			d = 1;
+			goto checkescapes;
+		}
+	}
+}
+
+
 /*
  * Called when an unexpected token is read during the parse. The argument
  * is the token that is expected, or -1 if more than one type of token can
@@ -1540,7 +1572,7 @@ expandstr(const char *ps)
 	n.narg.text = wordtext;
 	n.narg.backquote = backquotelist;
 
-	expandarg(&n, NULL, EXP_QUOTED);
+	expandarg(&n, NULL, 0);
 	return stackblock();
 }
 
diff --git a/src/parser.h b/src/parser.h
index 2875cce..d239043 100644
--- a/src/parser.h
+++ b/src/parser.h
@@ -42,14 +42,19 @@
 #define CTLVAR -126 /* variable defn */
 #define CTLENDVAR -125
 #define CTLBACKQ -124
+#define CTLQUOTE 01 /* ored with CTLBACKQ code if in quotes */
+/* CTLBACKQ | CTLQUOTE == -123 */
 #define CTLARI -122 /* arithmetic expression */
 #define CTLENDARI -121
 #define CTLQUOTEMARK -120
 #define CTL_LAST -120 /* last 'special' character */
 
-/* variable substitution byte (follows CTLVAR) */
+/* variable substitution byte (follows CTLVAR), values picked to be distinct from control characters */
 #define VSTYPE 0x0f /* type of variable substitution */
 #define VSNUL 0x10 /* colon--treat the empty string as unset */
+#define VSSYNTAX 0xc0
+#define VSQUOTE 0x40 /* inside double quotes--suppress splitting */
+#define VSARITH 0xc0 /* inside $((...)) arithmetic */
 
 /* values of VSTYPE field */
 #define VSNORMAL 0x1 /* normal variable: $var or ${var} */
@@ -57,10 +62,12 @@
 #define VSPLUS 0x3 /* ${var+text} */
 #define VSQUESTION 0x4 /* ${var?message} */
 #define VSASSIGN 0x5 /* ${var=text} */
+#define VSTRIM_FIRST 0x6
 #define VSTRIMRIGHT 0x6 /* ${var%pattern} */
 #define VSTRIMRIGHTMAX 0x7 /* ${var%%pattern} */
 #define VSTRIMLEFT 0x8 /* ${var#pattern} */
 #define VSTRIMLEFTMAX 0x9 /* ${var##pattern} */
+#define VSTRIM_LAST 0x9
 #define VSLENGTH 0xa /* ${#var} */
 
 /* values of checkkwd variable */
@@ -88,6 +95,7 @@ const char *getprompt(void *);
 const char *const *findkwd(const char *);
 char *endofname(const char *);
 const char *expandstr(const char *);
+const char *findstartchar(const char *, const char *, char, char);
 
 static inline int
 goodname(const char *p)
diff --git a/src/redir.c b/src/redir.c
index f96a76b..527b3be 100644
--- a/src/redir.c
+++ b/src/redir.c
@@ -304,7 +304,7 @@ openhere(union node *redir)
 
 	p = redir->nhere.doc->narg.text;
 	if (redir->type == NXHERE) {
-		expandarg(redir->nhere.doc, NULL, EXP_QUOTED);
+		expandarg(redir->nhere.doc, NULL, 0);
 		p = stackblock();
 	}
 
diff --git a/src/show.c b/src/show.c
index 4a049e9..839a40a 100644
--- a/src/show.c
+++ b/src/show.c
@@ -222,6 +222,7 @@ sharg(union node *arg, FILE *fp)
 				putc('}', fp);
 			break;
 		case CTLBACKQ:
+		case CTLBACKQ|CTLQUOTE:
 			putc('$', fp);
 			putc('(', fp);
 			shtree(bqlist->n, -1, NULL, fp);
@@ -314,6 +315,7 @@ trstring(char *s)
 		case CTLESC: c = 'e'; goto backslash;
 		case CTLVAR: c = 'v'; goto backslash;
 		case CTLBACKQ: c = 'q'; goto backslash;
+		case CTLBACKQ|CTLQUOTE: c = 'Q'; goto backslash;
 backslash:	putc('\\', tracefile);
 			putc(c, tracefile);
 			break;
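
For readers following the mksyntax.c and parser.c changes, here is a
minimal standalone sketch of how slot 0 of each syntax table appears to
carry the VSSYNTAX bits into the substitution byte. It assumes that the
syntax pointers are the tables offset by SYNBASE, which the PVSSYNTAX
value of -131 suggests but the diff itself does not show. The names
base_table, dq_table, ari_table, pack_subtype and enclosing_syntax are
made up for illustration and are not part of dash.

```c
#include <assert.h>

/* Values copied from the patch (src/parser.h and src/mksyntax.c). */
#define VSTYPE    0x0f
#define VSNUL     0x10
#define VSSYNTAX  0xc0
#define VSQUOTE   0x40
#define VSARITH   0xc0
#define VSMINUS   0x2
#define SYNBASE   131
#define PVSSYNTAX (-131)
#define TABLE_LEN 259

/*
 * Hypothetical stand-ins for the generated tables. Only slot 0, the
 * slot the patch adds, is filled in. unsigned char is used here to keep
 * the sketch free of sign-extension concerns; the patch uses char.
 */
static const unsigned char base_table[TABLE_LEN] = { 0 };
static const unsigned char dq_table[TABLE_LEN] = { VSQUOTE };
static const unsigned char ari_table[TABLE_LEN] = { VSARITH };

/* Assumption: a syntax pointer is its table offset by SYNBASE. */
#define BASE_SYNTAX (base_table + SYNBASE)
#define DQ_SYNTAX   (dq_table + SYNBASE)
#define ARI_SYNTAX  (ari_table + SYNBASE)

/* Fold the enclosing syntax into the substitution byte. */
static unsigned char
pack_subtype(unsigned char subtype, const unsigned char *syntax)
{
	/* The type and VSNUL bits must not overlap the syntax bits. */
	assert((subtype & VSSYNTAX) == 0);
	return (unsigned char)(subtype | syntax[PVSSYNTAX]);
}

/* Recover the VSSYNTAX bits from a packed substitution byte. */
static unsigned char
enclosing_syntax(unsigned char packed)
{
	return (unsigned char)(packed & VSSYNTAX);
}
```

With these definitions, pack_subtype(VSMINUS | VSNUL, DQ_SYNTAX) gives
0x52: the low nibble is still VSMINUS, and masking with VSSYNTAX gives
back VSQUOTE. That is the kind of lookup the patch performs on
startchar[1] when it reaches a closing brace.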