diff mbox series

[v3] parser: don't keep alloca()ing in a loop for substitutions

Message ID 20230105134204.e7mczfzt2vbit3oc@tarta.nabijaczleweli.xyz (mailing list archive)
State Accepted
Delegated to: Herbert Xu
Headers show
Series [v3] parser: don't keep alloca()ing in a loop for substitutions | expand

Commit Message

наб Jan. 5, 2023, 1:42 p.m. UTC
When encountering
  printf %010000d | tr 0 \` | sh -n
  printf %09999d  | tr 0 \` | sh -n
you want no output and "Syntax error: EOF in backquote substitution",
respectively; instead, current dash segfaults.

This is because the alloca for the save buffer is run, naturally,
in the same function, so first it allocates one byte, then two,
then ..., then appx. 4000 (for me, depends on the binary),
then it segfaults on the memcpy (it's even worse, since due to
alignment, it usually allocates much more for the early stuff).

Nevertheless, the stack frame grows unboundedly, until we completely
destroy the stack. Instead of squirreling the out block away, then
letting subsequent allocations override the original, mark it used,
and just re-copy it to the top of the dash stack. This increases peak
memory usage somewhat
(in the most pathological case ‒ the above but with three nines ‒
 from 23.26 to 173.7KiB according to massif,
 in parsing a regular program (ratrun from ratrun 0c)
 from 28.68 to 29.19;
 a simpler program (ibid., rat) stays at 5.422;
 parsing libtoolize, debootstrap, and dkms
 (the biggest shell programs in my /[s]bin by size + by `/$( count)
 likewise stay the same at 12.02, 41.48, and 6.438)
but it's barely measurable outside of truly pathological conditions
that were a step away from a segfault previously.

Fixes: https://bugs.debian.org/966156
---
I think this means we also need to turn the USTPUTC() into STPUTC(),
since the previous code explicitly over-accounted for it in growstackto().

 src/parser.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

Comments

Herbert Xu Jan. 8, 2023, 12:07 p.m. UTC | #1
On Thu, Jan 05, 2023 at 02:42:04PM +0100, наб wrote:
> When encountering
>   printf %010000d | tr 0 \` | sh -n
>   printf %09999d  | tr 0 \` | sh -n
> you want no output and "Syntax error: EOF in backquote substitution",
> respectively; instead, current dash segfaults.
> 
> This is because the alloca for the save buffer is run, naturally,
> in the same function, so first it allocates one byte, then two,
> then ..., then appx. 4000 (for me, depends on the binary),
> then it segfaults on the memcpy (it's even worse, since due to
> alignment, it usually allocates much more for the early stuff).
> 
> Nevertheless, the stack frame grows unboundedly, until we completely
> destroy the stack. Instead of squirreling the out block away, then
> letting subsequent allocations override the original, mark it used,
> and just re-copy it to the top of the dash stack. This increases peak
> memory usage somewhat
> (in the most pathological case ‒ the above but with three nines ‒
>  from 23.26 to 173.7KiB according to massif,
>  in parsing a regular program (ratrun from ratrun 0c)
>  from 28.68 to 29.19;
>  a simpler program (ibid., rat) stays at 5.422;
>  parsing libtoolize, debootstrap, and dkms
>  (the biggest shell programs in my /[s]bin by size + by `/$( count)
>  likewise stay the same at 12.02, 41.48, and 6.438)
> but it's barely measurable outside of truly pathological conditions
> that were a step away from a segfault previously.
> 
> Fixes: https://bugs.debian.org/966156
> ---
> I think this means we also need to turn the USTPUTC() into STPUTC(),
> since the previous code explicitly over-accounted for it in growstackto().
> 
>  src/parser.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)

Patch applied.  Thanks.
diff mbox series

Patch

diff --git a/src/parser.c b/src/parser.c
index 8a06b9e..f5f76d5 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -1360,12 +1360,9 @@  parsebackq: {
 	struct heredoc *saveheredoclist;
 	int uninitialized_var(saveprompt);
 
-	str = NULL;
+	str = stackblock();
 	savelen = out - (char *)stackblock();
-	if (savelen > 0) {
-		str = alloca(savelen);
-		memcpy(str, stackblock(), savelen);
-	}
+	grabstackblock(savelen);
         if (oldstyle) {
                 /* We must read until the closing backquote, giving special
                    treatment to some slashes, and then push the string and
@@ -1445,12 +1442,8 @@  done:
 	/* Ignore any pushed back tokens left from the backquote parsing. */
 	if (oldstyle)
 		tokpushback = 0;
-	out = growstackto(savelen + 1);
-	if (str) {
-		memcpy(out, str, savelen);
-		STADJUST(savelen, out);
-	}
-	USTPUTC(CTLBACKQ, out);
+	out = stnputs(str, savelen, stackblock());
+	STPUTC(CTLBACKQ, out);
 	if (oldstyle)
 		goto parsebackq_oldreturn;
 	else