diff mbox series

[RFC] shell: local x=$1 may need to quote the RHS

Message ID xmqqsftc3nd6.fsf@gitster.g (mailing list archive)
State New, archived
Headers show
Series [RFC] shell: local x=$1 may need to quote the RHS | expand

Commit Message

Junio C Hamano Jan. 24, 2022, 8:11 p.m. UTC
Even though this message ends with a patch that could be applied to
your tree, it is not for application at all.  It is to demonstrate a
potential problem in the code in our tree. 

While I was playing with the "Linux development environment (beta)"
on one of my Chromebooks, I say its default /bin/sh (which is Dash
0.5.10) mishandled this construct:

	func () {
		local x=$1
		clobber x and nothing else and feel safe
		message="I expect this to be visible by the caller"
	}

	func "a message"
	use "$message"

It assigned 'a' to $x in the function, DECLARED the varilable
$message as local to the function, hence the caller after func
returned did not see what I intended to see in $message.

The breakage is subtle; unless you have a character in $1 that would
not make a valid variable name, you won't get any error message yet
the program would behave in an unexpected way.

In other words, all of these hits are suspect and may misbehave with
such a shell.

    $ git grep -e '^[	 ]*local.* [a-z0-9_]*=\$' t
    t/lib-parallel-checkout.sh:	local expected_workers=$1 &&
    t/t0000-basic.sh:	local x=$1
    t/t4011-diff-symlink.sh:	local oid=$(printf "%s" "$1" | git hash-object --stdin) &&
    t/t4011-diff-symlink.sh:	local oid=$(git hash-object "$1") &&
    t/t4210-log-i18n.sh:	local engine=$1
    t/t4210-log-i18n.sh:	local pattern=$1
    t/test-lib-functions.sh:	local basename=${1#??}
    t/test-lib-functions.sh:	local var=$1 port
    t/test-lib-functions.sh:	local expr=$(printf '"%s",' "$@")
    t/test-lib-functions.sh:	local expr=$(printf '"%s".*' "$@")

Outside t/, I didn't find anything that may be run with dash and
can be fed problematic input, so as far as I can tell, it is a
problem that only affects the current set of tests.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 t/t0000-basic.sh | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Konstantin Khomoutov Jan. 25, 2022, 9:24 a.m. UTC | #1
On Mon, Jan 24, 2022 at 12:11:33PM -0800, Junio C Hamano wrote:

> Even though this message ends with a patch that could be applied to
> your tree, it is not for application at all.  It is to demonstrate a
> potential problem in the code in our tree. 
> 
> While I was playing with the "Linux development environment (beta)"
> on one of my Chromebooks, I say its default /bin/sh (which is Dash
> 0.5.10) mishandled this construct:
> 
> 	func () {
> 		local x=$1
> 		clobber x and nothing else and feel safe
> 		message="I expect this to be visible by the caller"
> 	}
> 
> 	func "a message"
> 	use "$message"
> 
> It assigned 'a' to $x in the function, DECLARED the varilable
> $message as local to the function, hence the caller after func
> returned did not see what I intended to see in $message.
> 
> The breakage is subtle; unless you have a character in $1 that would
> not make a valid variable name, you won't get any error message yet
> the program would behave in an unexpected way.

Sorry, I might have not followed the entire thread, but assignment in
`local` is a bashism, and dash can only handle the declaration part with
`local`, not assignment; hence the safe code should read

  local x
  x="$1"

To cite the dash manual page:

| Variables may be declared to be local to a function by using a local
| command.  This should appear as the first statement of a function, and the
| syntax is
| 
|    local [variable | -] ...
| 
| Local is implemented as a builtin command.
| 
| When a variable is made local, it inherits the initial value and exported
| and readonly flags from the variable with the same name in the surrounding
| scope, if there is one.  Otherwise, the vari‐ able is initially unset.  The
| shell uses dynamic scoping, so that if you make the variable x lo‐ cal to
| function f, which then calls function g, references to the variable x made
| inside g will refer to the variable x declared inside f, not to the global
| variable named x.
| 
| The only special parameter that can be made local is “-”.  Making “-” local
| any shell options that are changed via the set command inside the function
| to be restored to their original values
| when the function returns.
Junio C Hamano Jan. 25, 2022, 7 p.m. UTC | #2
Konstantin Khomoutov <kostix@bswap.ru> writes:

> Sorry, I might have not followed the entire thread, but assignment in
> `local` is a bashism, and dash can only handle the declaration part with
> `local`, not assignment; hence the safe code should read
>
>   local x
>   x="$1"

Interesting.  As "local" is not in POSIX but we still use it for
convenience, we must limit our use to a reasonable subset of
features common to the shells we care about.  Knowing what each
shell can and cannot do safely is essential to us.

The patch posted seemed to run fine with a more recent dash than
what I had trouble with (0.5.10 would work fine with "$1" quoted,
0.5.11 would work fine without $1, just like the RHS of a regular
assignment).  In addition, there are many existing tests that
already use "local var=initial-value" (the message you are
responding to has an output from "grep") and we haven't got problem
reports from dash users about them.

The manual page for recent dash may need an update.
Can you perhaps file a bug on their documentation?

Thanks.
Taylor Blau Jan. 25, 2022, 8:19 p.m. UTC | #3
On Tue, Jan 25, 2022 at 11:00:39AM -0800, Junio C Hamano wrote:
> Konstantin Khomoutov <kostix@bswap.ru> writes:
>
> > Sorry, I might have not followed the entire thread, but assignment in
> > `local` is a bashism, and dash can only handle the declaration part with
> > `local`, not assignment; hence the safe code should read
> >
> >   local x
> >   x="$1"
>
> Interesting.  As "local" is not in POSIX but we still use it for
> convenience, we must limit our use to a reasonable subset of
> features common to the shells we care about.  Knowing what each
> shell can and cannot do safely is essential to us.
>
> The patch posted seemed to run fine with a more recent dash than
> what I had trouble with (0.5.10 would work fine with "$1" quoted,
> 0.5.11 would work fine without $1, just like the RHS of a regular
> assignment).  In addition, there are many existing tests that
> already use "local var=initial-value" (the message you are
> responding to has an output from "grep") and we haven't got problem
> reports from dash users about them.

Yeah; bisecting dash with your example script pointed me at cbb71a8
(eval: Add assignment built-in support again, 2018-05-19), which indeed
appears in v0.5.11 (and all subsequent versions).

cbb71a8 points at release 0.3.8-15, which predates Git (and a tag
pointing at it was never created, since it's behind the big "initial
import" commit at the beginning of dash.git's history). But skimming
ChangeLog.O, we see:

    * Removed assignment builtins since it is at best undefined by the
      SuS and also can't be implemented consistently.

So this probably didn't work at all between that 0.3.8-15 up until v0.5.11.

> The manual page for recent dash may need an update.
> Can you perhaps file a bug on their documentation?

Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin:
please feel free to use any of this if it's helpful to you in creating a
bug report for the dash folks.

Thanks,
Taylor
Junio C Hamano Jan. 26, 2022, 5:53 a.m. UTC | #4
Taylor Blau <me@ttaylorr.com> writes:

> Yeah; bisecting dash with your example script pointed me at cbb71a8
> (eval: Add assignment built-in support again, 2018-05-19), which indeed
> appears in v0.5.11 (and all subsequent versions).
>
> cbb71a8 points at release 0.3.8-15, which predates Git (and a tag
> pointing at it was never created, since it's behind the big "initial
> import" commit at the beginning of dash.git's history). But skimming
> ChangeLog.O, we see:
>
>     * Removed assignment builtins since it is at best undefined by the
>       SuS and also can't be implemented consistently.
>
> So this probably didn't work at all between that 0.3.8-15 up until v0.5.11.
>
>> The manual page for recent dash may need an update.
>> Can you perhaps file a bug on their documentation?
>
> Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin:
> please feel free to use any of this if it's helpful to you in creating a
> bug report for the dash folks.

I doubt that we can write off dash v0.5.10 as too old to matter, as
the tagger date seems to be mid 2020 at 

https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11


So here is a bit wider "grep" output, looking for

$ git grep '^[ 	]*local[	 ].*=' \*.sh

to reject _any_ assignment on the same line as "local", but I
manually excluded the ones that are not meant to be run with dash.
I obviously excluded the one in t0000 (try_local_xy) that is a
weather-balloon for this exact issue.

According to this thread, these would not work correctly on dash
older than 0.5.11 and needs fixing by splitting declaration of
variables as locals and assignment of their initial values.


contrib/subtree/git-subtree.sh:	local indent=$(($indent + 1))
contrib/subtree/git-subtree.sh:	local indent=$(($indent + 1))
contrib/subtree/git-subtree.sh:	local indent=$(($indent + 1))
contrib/subtree/git-subtree.sh:	local grep_format="^git-subtree-dir: $dir/*\$"
contrib/subtree/git-subtree.sh:	local rev="$1"
contrib/subtree/git-subtree.sh:	local parents="$2"
contrib/subtree/git-subtree.sh:	local indent=$(($indent + 1))
t/lib-parallel-checkout.sh:	local expected_workers=$1 &&
t/lib-parallel-checkout.sh:	local trace_file=trace-test-checkout-workers &&
t/lib-parallel-checkout.sh:	local workers="$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l)" &&
t/t0021-conversion.sh:		local expect_progress=N &&
t/t0021-conversion.sh:		local expect_progress=
t/t3435-rebase-gpg-sign.sh:	local must_fail= will=will fake_editor=
t/t3514-cherry-pick-revert-gpg.sh:	local must_fail= will=will fake_editor=
t/t4011-diff-symlink.sh:	local oid=$(printf "%s" "$1" | git hash-object --stdin) &&
t/t4011-diff-symlink.sh:	local oid=$(git hash-object "$1") &&
t/t4210-log-i18n.sh:	local engine=$1
t/t4210-log-i18n.sh:	local pattern=$1
t/t6006-rev-list-format.sh:	local args=
t/t6006-rev-list-format.sh:	local args=
t/t9902-completion.sh:	local IFS=$'\n'
t/t9902-completion.sh:	local cur="$1" &&
t/t9902-completion.sh:	local cur="$1" &&
t/t9902-completion.sh:	local cur="$1" expected="$2"
t/test-lib-functions.sh:	local op='=' wrong_result=different
t/test-lib-functions.sh:	local test_cmp_a= test_cmp_b=
t/test-lib-functions.sh:	local stdin_for_diff=
t/test-lib-functions.sh:	local algo="${test_hash_algo}" &&
t/test-lib-functions.sh:	local var="test_oid_${algo}_$1" &&
t/test-lib-functions.sh:	local basename=${1#??}
t/test-lib-functions.sh:	local var=$1 port
t/test-lib-functions.sh:	local negate=
t/test-lib-functions.sh:	local expr=$(printf '"%s",' "$@")
t/test-lib-functions.sh:	local negate=
t/test-lib-functions.sh:	local expr=$(printf '"%s".*' "$@")
t/test-lib-functions.sh:	local expect_exit=0
t/test-lib.sh:	local opt="$1"
t/test-lib.sh:	local bail_out="Bail out! "
t/test-lib.sh:	local message="$1"
Konstantin Khomoutov Jan. 26, 2022, 11:39 a.m. UTC | #5
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote:

> Taylor Blau <me@ttaylorr.com> writes:
> 
>> Yeah; bisecting dash with your example script pointed me at cbb71a8
>> (eval: Add assignment built-in support again, 2018-05-19), which indeed
>> appears in v0.5.11 (and all subsequent versions).
>>
>> cbb71a8 points at release 0.3.8-15, which predates Git (and a tag
>> pointing at it was never created, since it's behind the big "initial
>> import" commit at the beginning of dash.git's history). But skimming
>> ChangeLog.O, we see:
>>
>>     * Removed assignment builtins since it is at best undefined by the
>>       SuS and also can't be implemented consistently.
>>
>> So this probably didn't work at all between that 0.3.8-15 up until v0.5.11.
[...]
> So here is a bit wider "grep" output, looking for
> 
> $ git grep '^[ 	]*local[	 ].*=' \*.sh
> 
> to reject _any_ assignment on the same line as "local", but I
> manually excluded the ones that are not meant to be run with dash.
> I obviously excluded the one in t0000 (try_local_xy) that is a
> weather-balloon for this exact issue.
> 
> According to this thread, these would not work correctly on dash
> older than 0.5.11 and needs fixing by splitting declaration of
> variables as locals and assignment of their initial values.

By the way, back then when Debian and Ubuntu were switching running their
system scripts from bash to dash (which bought definite speedups), many
scripts had to get rid of bashisms, and this page [1] summarizes quite many
differences between these shells including handing of `local`.
To cite it:

| Dash (old versions maybe?) and (at least) bash do not handle local the same:
|
| -    local a=5 b=6;
| +    local a=5;
| +    local b=6;
|
| The first line in /bin/bash makes a and b local variables. And in /bin/dash
| this line makes b a global variable! 

Not sure it lists a possible problems our test harness also has but wanted to
give a heads-up anyway just in case.

 1. https://wiki.ubuntu.com/DashAsBinSh
SZEDER Gábor Jan. 26, 2022, 9:48 p.m. UTC | #6
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> 
> > Yeah; bisecting dash with your example script pointed me at cbb71a8
> > (eval: Add assignment built-in support again, 2018-05-19), which indeed
> > appears in v0.5.11 (and all subsequent versions).
> >
> > cbb71a8 points at release 0.3.8-15, which predates Git (and a tag
> > pointing at it was never created, since it's behind the big "initial
> > import" commit at the beginning of dash.git's history). But skimming
> > ChangeLog.O, we see:
> >
> >     * Removed assignment builtins since it is at best undefined by the
> >       SuS and also can't be implemented consistently.
> >
> > So this probably didn't work at all between that 0.3.8-15 up until v0.5.11.
> >
> >> The manual page for recent dash may need an update.
> >> Can you perhaps file a bug on their documentation?
> >
> > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin:
> > please feel free to use any of this if it's helpful to you in creating a
> > bug report for the dash folks.
> 
> I doubt that we can write off dash v0.5.10 as too old to matter, as
> the tagger date seems to be mid 2020 at 
> 
> https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11

I agree, and v0.5.10 is widely used, as it is the default shell in
Ubuntu 20.04 LTS.

> So here is a bit wider "grep" output, looking for
> 
> $ git grep '^[ 	]*local[	 ].*=' \*.sh
> 
> to reject _any_ assignment on the same line as "local", but I
> manually excluded the ones that are not meant to be run with dash.
> I obviously excluded the one in t0000 (try_local_xy) that is a
> weather-balloon for this exact issue.
> 
> According to this thread, these would not work correctly on dash
> older than 0.5.11 and needs fixing by splitting declaration of
> variables as locals and assignment of their initial values.

I don't think that's necessary.  Whatever that (apparently horribly
outdated) documentation might state, I think what really matters is
what dash actually does.

The patch below puts a bit more strain on our `local` weatherballoon
test:

  - It checks assignments from a parameter expansion and command
    substitution as well, not only simple string assignments.

  - The expanded values contain spaces and are quoted (to avoid the
    field splitting issue that started this thread [1]).

  - Creates a local variable with a name that doesn't hide an already
    existing variable in the global scope, to see whether it becomes
    global.

It succeeds with all tagged dash versions, the oldest being v0.5.2
tagged on 2005-01-31.  I think that's more than old enough to say that
it's not necessary to put local variable declaration and assignment on
different lines, and we can declare multiple local variables on one
line.

Note that NetBSD 8.1's /bin/sh need quotes on the RHS in similar
assignments as well.

[1] We've already run into this issue in ebee5580ca
    (parallel-checkout: avoid dash local bug in tests, 2021-06-06),
    but apparently didn't think through what else might be affected by
    that bug.


diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 03fead95e7..196106fdf5 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -22,8 +22,8 @@ modification *should* take notice and update the test vectors here.
 . "$TEST_DIRECTORY"/lib-subtest.sh
 
 try_local_xy () {
-	local x="local" y="alsolocal" &&
-	echo "$x $y"
+	local x="local" y="${no_such_var-also local}" z="$(echo "third local")" &&
+	echo "$x $y $z"
 }
 
 # Check whether the shell supports the "local" keyword. "local" is not
@@ -37,11 +37,11 @@ try_local_xy () {
 test_expect_success 'verify that the running shell supports "local"' '
 	x="notlocal" &&
 	y="alsonotlocal" &&
-	echo "local alsolocal" >expected1 &&
+	echo "local also local third local" >expected1 &&
 	try_local_xy >actual1 &&
 	test_cmp expected1 actual1 &&
-	echo "notlocal alsonotlocal" >expected2 &&
-	echo "$x $y" >actual2 &&
+	echo "notlocal alsonotlocal z_unset" >expected2 &&
+	echo "$x $y ${z-z_unset}" >actual2 &&
 	test_cmp expected2 actual2
 '
Taylor Blau Jan. 26, 2022, 9:52 p.m. UTC | #7
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> >
> >> The manual page for recent dash may need an update.
> >> Can you perhaps file a bug on their documentation?
> >
> > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin:
> > please feel free to use any of this if it's helpful to you in creating a
> > bug report for the dash folks.
>
> I doubt that we can write off dash v0.5.10 as too old to matter, as
> the tagger date seems to be mid 2020 at
>
> https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11

That isn't quite what I was implying. What I meant to say was that the
dash _manual page_ is out-of-date w.r.t. the dash patch I linked, not
that that version is something we could ignore.

Thanks,
Taylor
Junio C Hamano Jan. 27, 2022, 12:17 a.m. UTC | #8
Taylor Blau <me@ttaylorr.com> writes:

> On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>> >
>> >> The manual page for recent dash may need an update.
>> >> Can you perhaps file a bug on their documentation?
>> >
>> > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin:
>> > please feel free to use any of this if it's helpful to you in creating a
>> > bug report for the dash folks.
>>
>> I doubt that we can write off dash v0.5.10 as too old to matter, as
>> the tagger date seems to be mid 2020 at
>>
>> https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11
>
> That isn't quite what I was implying. What I meant to say was that the
> dash _manual page_ is out-of-date w.r.t. the dash patch I linked, not
> that that version is something we could ignore.

Oh, I didn't mean it that way.  I was continuing from your findings
that certain features of "local" may not have been available in
0.5.10; hence the hits from grep that showed any assignment might be
problematic with that version and we may need to downgrade our
scripts.

But as Szeder writes in a response later in the thread, assignment
on "local" may have been avaiable in versions before 0.5.11 in a
limited form (namely, it is split at $IFS unlike normal assignments,
which was the starting point of this thread), which reduces the
scope of downgrading necessary by a large margin ;-)
diff mbox series

Patch

diff --git c/t/t0000-basic.sh w/t/t0000-basic.sh
index b007f0efef..915eb2384f 100755
--- c/t/t0000-basic.sh
+++ w/t/t0000-basic.sh
@@ -45,6 +45,19 @@  test_expect_success 'verify that the running shell supports "local"' '
 	test_cmp expected2 actual2
 '
 
+try_local_unquoted () {
+	local x=$1
+	y="newvalue"
+}
+
+test_expect_success 'verify that "local x=$1" do not quoting' '
+	y=oldvalue &&
+	echo "newvalue" >expect &&
+	try_local_unquoted "x y" &&
+	echo "$y" >actual &&
+	test_cmp expect actual
+'
+
 ################################################################
 # git init has been done in an empty repository.
 # make sure it is empty.