Message ID | 20210621095719.GA9287@gondor.apana.org.au (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Herbert Xu |
Headers | show |
Series | parser: Fix VSLENGTH parsing with trailing garbage | expand |
On Mon, Jun 21, 2021 at 11:57 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Sat, Jun 19, 2021 at 02:44:46PM +0200, Denys Vlasenko wrote: > > CTLVAR and CTLBACKQ are not properly handled if encountered > > inside {$#...}. Testcase: > > > > dash -c "`printf 'echo ${#1\x82}'`" 00 111 222 ... > In fact these two bugs are one and the same. This patch fixes > both by detecting the invalid substitution and not emitting it > into the node tree. > > Incidentally this reveals a bug in how we parse ${#10} that got > introduced recently, which is also fixed here. ... > --- a/src/parser.h > +++ b/src/parser.h > @@ -62,6 +62,7 @@ > #define VSTRIMLEFT 0x8 /* ${var#pattern} */ > #define VSTRIMLEFTMAX 0x9 /* ${var##pattern} */ > #define VSLENGTH 0xa /* ${#var} */ > +/* VSLENGTH must come last. */ The above assumption is not necessary if... > diff --git a/src/parser.c b/src/parser.c > index 3c80d17..13c2df5 100644 > --- a/src/parser.c > +++ b/src/parser.c > @@ -1252,7 +1252,8 @@ varname: > do { > STPUTC(c, out); > c = pgetc_eatbnl(); > - } while (!subtype && is_digit(c)); > + } while ((subtype <= 0 || subtype >= VSLENGTH) && > + is_digit(c)); ... you use (subtype == 0 || subtype == VSLENGTH) here. Also, (subtype == 0 || subtype == VSLENGTH) is less confusing: it says "loop if ${VAR} or ${#VAR} syntax", whereas <= >= are a bit misleading.
On Mon, Jun 21, 2021 at 04:21:40PM +0200, Denys Vlasenko wrote: > > > diff --git a/src/parser.c b/src/parser.c > > index 3c80d17..13c2df5 100644 > > --- a/src/parser.c > > +++ b/src/parser.c > > @@ -1252,7 +1252,8 @@ varname: > > do { > > STPUTC(c, out); > > c = pgetc_eatbnl(); > > - } while (!subtype && is_digit(c)); > > + } while ((subtype <= 0 || subtype >= VSLENGTH) && > > + is_digit(c)); > > ... you use (subtype == 0 || subtype == VSLENGTH) here. > Also, (subtype == 0 || subtype == VSLENGTH) is less confusing: > it says "loop if ${VAR} or ${#VAR} syntax", whereas <= >= > are a bit misleading. Yes it looks a bit confusing, but it turns into a single branch instead of two. Perhaps I should add a helper function for it. Thanks,
On Tue, Jun 22, 2021 at 2:19 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Mon, Jun 21, 2021 at 04:21:40PM +0200, Denys Vlasenko wrote: > > > - } while (!subtype && is_digit(c)); > > > + } while ((subtype <= 0 || subtype >= VSLENGTH) && > > > + is_digit(c)); > > > > ... you use (subtype == 0 || subtype == VSLENGTH) here. > > Also, (subtype == 0 || subtype == VSLENGTH) is less confusing: > > it says "loop if ${VAR} or ${#VAR} syntax", whereas <= >= > > are a bit misleading. > > Yes it looks a bit confusing, but it turns into a single branch > instead of two. Yes, I know that. Compiler turns it into "(unsigned)(x-1) >= VSLENGTH-1" expression. But is it worth the obfuscation? Especially that it also has another downside (it requires an additional free CPU register to hold (x-1) result, which can force compiler to spill other values to stack).
diff --git a/src/parser.c b/src/parser.c index 3c80d17..13c2df5 100644 --- a/src/parser.c +++ b/src/parser.c @@ -1252,7 +1252,8 @@ varname: do { STPUTC(c, out); c = pgetc_eatbnl(); - } while (!subtype && is_digit(c)); + } while ((subtype <= 0 || subtype >= VSLENGTH) && + is_digit(c)); } else if (c != '}') { int cc = c; @@ -1312,6 +1313,8 @@ varname: break; } } else { + if (subtype == VSLENGTH && c != '}') + subtype = 0; badsub: pungetc(); } diff --git a/src/parser.h b/src/parser.h index 524ac1c..7d2749b 100644 --- a/src/parser.h +++ b/src/parser.h @@ -62,6 +62,7 @@ #define VSTRIMLEFT 0x8 /* ${var#pattern} */ #define VSTRIMLEFTMAX 0x9 /* ${var##pattern} */ #define VSLENGTH 0xa /* ${#var} */ +/* VSLENGTH must come last. */ /* values of checkkwd variable */ #define CHKALIAS 0x1