Message ID | xmqqsftc3nd6.fsf@gitster.g (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] shell: local x=$1 may need to quote the RHS | expand |
On Mon, Jan 24, 2022 at 12:11:33PM -0800, Junio C Hamano wrote: > Even though this message ends with a patch that could be applied to > your tree, it is not for application at all. It is to demonstrate a > potential problem in the code in our tree. > > While I was playing with the "Linux development environment (beta)" > on one of my Chromebooks, I say its default /bin/sh (which is Dash > 0.5.10) mishandled this construct: > > func () { > local x=$1 > clobber x and nothing else and feel safe > message="I expect this to be visible by the caller" > } > > func "a message" > use "$message" > > It assigned 'a' to $x in the function, DECLARED the varilable > $message as local to the function, hence the caller after func > returned did not see what I intended to see in $message. > > The breakage is subtle; unless you have a character in $1 that would > not make a valid variable name, you won't get any error message yet > the program would behave in an unexpected way. Sorry, I might have not followed the entire thread, but assignment in `local` is a bashism, and dash can only handle the declaration part with `local`, not assignment; hence the safe code should read local x x="$1" To cite the dash manual page: | Variables may be declared to be local to a function by using a local | command. This should appear as the first statement of a function, and the | syntax is | | local [variable | -] ... | | Local is implemented as a builtin command. | | When a variable is made local, it inherits the initial value and exported | and readonly flags from the variable with the same name in the surrounding | scope, if there is one. Otherwise, the vari‐ able is initially unset. The | shell uses dynamic scoping, so that if you make the variable x lo‐ cal to | function f, which then calls function g, references to the variable x made | inside g will refer to the variable x declared inside f, not to the global | variable named x. | | The only special parameter that can be made local is “-”. Making “-” local | any shell options that are changed via the set command inside the function | to be restored to their original values | when the function returns.
Konstantin Khomoutov <kostix@bswap.ru> writes: > Sorry, I might have not followed the entire thread, but assignment in > `local` is a bashism, and dash can only handle the declaration part with > `local`, not assignment; hence the safe code should read > > local x > x="$1" Interesting. As "local" is not in POSIX but we still use it for convenience, we must limit our use to a reasonable subset of features common to the shells we care about. Knowing what each shell can and cannot do safely is essential to us. The patch posted seemed to run fine with a more recent dash than what I had trouble with (0.5.10 would work fine with "$1" quoted, 0.5.11 would work fine without $1, just like the RHS of a regular assignment). In addition, there are many existing tests that already use "local var=initial-value" (the message you are responding to has an output from "grep") and we haven't got problem reports from dash users about them. The manual page for recent dash may need an update. Can you perhaps file a bug on their documentation? Thanks.
On Tue, Jan 25, 2022 at 11:00:39AM -0800, Junio C Hamano wrote: > Konstantin Khomoutov <kostix@bswap.ru> writes: > > > Sorry, I might have not followed the entire thread, but assignment in > > `local` is a bashism, and dash can only handle the declaration part with > > `local`, not assignment; hence the safe code should read > > > > local x > > x="$1" > > Interesting. As "local" is not in POSIX but we still use it for > convenience, we must limit our use to a reasonable subset of > features common to the shells we care about. Knowing what each > shell can and cannot do safely is essential to us. > > The patch posted seemed to run fine with a more recent dash than > what I had trouble with (0.5.10 would work fine with "$1" quoted, > 0.5.11 would work fine without $1, just like the RHS of a regular > assignment). In addition, there are many existing tests that > already use "local var=initial-value" (the message you are > responding to has an output from "grep") and we haven't got problem > reports from dash users about them. Yeah; bisecting dash with your example script pointed me at cbb71a8 (eval: Add assignment built-in support again, 2018-05-19), which indeed appears in v0.5.11 (and all subsequent versions). cbb71a8 points at release 0.3.8-15, which predates Git (and a tag pointing at it was never created, since it's behind the big "initial import" commit at the beginning of dash.git's history). But skimming ChangeLog.O, we see: * Removed assignment builtins since it is at best undefined by the SuS and also can't be implemented consistently. So this probably didn't work at all between that 0.3.8-15 up until v0.5.11. > The manual page for recent dash may need an update. > Can you perhaps file a bug on their documentation? Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin: please feel free to use any of this if it's helpful to you in creating a bug report for the dash folks. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > Yeah; bisecting dash with your example script pointed me at cbb71a8 > (eval: Add assignment built-in support again, 2018-05-19), which indeed > appears in v0.5.11 (and all subsequent versions). > > cbb71a8 points at release 0.3.8-15, which predates Git (and a tag > pointing at it was never created, since it's behind the big "initial > import" commit at the beginning of dash.git's history). But skimming > ChangeLog.O, we see: > > * Removed assignment builtins since it is at best undefined by the > SuS and also can't be implemented consistently. > > So this probably didn't work at all between that 0.3.8-15 up until v0.5.11. > >> The manual page for recent dash may need an update. >> Can you perhaps file a bug on their documentation? > > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin: > please feel free to use any of this if it's helpful to you in creating a > bug report for the dash folks. I doubt that we can write off dash v0.5.10 as too old to matter, as the tagger date seems to be mid 2020 at https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11 So here is a bit wider "grep" output, looking for $ git grep '^[ ]*local[ ].*=' \*.sh to reject _any_ assignment on the same line as "local", but I manually excluded the ones that are not meant to be run with dash. I obviously excluded the one in t0000 (try_local_xy) that is a weather-balloon for this exact issue. According to this thread, these would not work correctly on dash older than 0.5.11 and needs fixing by splitting declaration of variables as locals and assignment of their initial values. contrib/subtree/git-subtree.sh: local indent=$(($indent + 1)) contrib/subtree/git-subtree.sh: local indent=$(($indent + 1)) contrib/subtree/git-subtree.sh: local indent=$(($indent + 1)) contrib/subtree/git-subtree.sh: local grep_format="^git-subtree-dir: $dir/*\$" contrib/subtree/git-subtree.sh: local rev="$1" contrib/subtree/git-subtree.sh: local parents="$2" contrib/subtree/git-subtree.sh: local indent=$(($indent + 1)) t/lib-parallel-checkout.sh: local expected_workers=$1 && t/lib-parallel-checkout.sh: local trace_file=trace-test-checkout-workers && t/lib-parallel-checkout.sh: local workers="$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l)" && t/t0021-conversion.sh: local expect_progress=N && t/t0021-conversion.sh: local expect_progress= t/t3435-rebase-gpg-sign.sh: local must_fail= will=will fake_editor= t/t3514-cherry-pick-revert-gpg.sh: local must_fail= will=will fake_editor= t/t4011-diff-symlink.sh: local oid=$(printf "%s" "$1" | git hash-object --stdin) && t/t4011-diff-symlink.sh: local oid=$(git hash-object "$1") && t/t4210-log-i18n.sh: local engine=$1 t/t4210-log-i18n.sh: local pattern=$1 t/t6006-rev-list-format.sh: local args= t/t6006-rev-list-format.sh: local args= t/t9902-completion.sh: local IFS=$'\n' t/t9902-completion.sh: local cur="$1" && t/t9902-completion.sh: local cur="$1" && t/t9902-completion.sh: local cur="$1" expected="$2" t/test-lib-functions.sh: local op='=' wrong_result=different t/test-lib-functions.sh: local test_cmp_a= test_cmp_b= t/test-lib-functions.sh: local stdin_for_diff= t/test-lib-functions.sh: local algo="${test_hash_algo}" && t/test-lib-functions.sh: local var="test_oid_${algo}_$1" && t/test-lib-functions.sh: local basename=${1#??} t/test-lib-functions.sh: local var=$1 port t/test-lib-functions.sh: local negate= t/test-lib-functions.sh: local expr=$(printf '"%s",' "$@") t/test-lib-functions.sh: local negate= t/test-lib-functions.sh: local expr=$(printf '"%s".*' "$@") t/test-lib-functions.sh: local expect_exit=0 t/test-lib.sh: local opt="$1" t/test-lib.sh: local bail_out="Bail out! " t/test-lib.sh: local message="$1"
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > >> Yeah; bisecting dash with your example script pointed me at cbb71a8 >> (eval: Add assignment built-in support again, 2018-05-19), which indeed >> appears in v0.5.11 (and all subsequent versions). >> >> cbb71a8 points at release 0.3.8-15, which predates Git (and a tag >> pointing at it was never created, since it's behind the big "initial >> import" commit at the beginning of dash.git's history). But skimming >> ChangeLog.O, we see: >> >> * Removed assignment builtins since it is at best undefined by the >> SuS and also can't be implemented consistently. >> >> So this probably didn't work at all between that 0.3.8-15 up until v0.5.11. [...] > So here is a bit wider "grep" output, looking for > > $ git grep '^[ ]*local[ ].*=' \*.sh > > to reject _any_ assignment on the same line as "local", but I > manually excluded the ones that are not meant to be run with dash. > I obviously excluded the one in t0000 (try_local_xy) that is a > weather-balloon for this exact issue. > > According to this thread, these would not work correctly on dash > older than 0.5.11 and needs fixing by splitting declaration of > variables as locals and assignment of their initial values. By the way, back then when Debian and Ubuntu were switching running their system scripts from bash to dash (which bought definite speedups), many scripts had to get rid of bashisms, and this page [1] summarizes quite many differences between these shells including handing of `local`. To cite it: | Dash (old versions maybe?) and (at least) bash do not handle local the same: | | - local a=5 b=6; | + local a=5; | + local b=6; | | The first line in /bin/bash makes a and b local variables. And in /bin/dash | this line makes b a global variable! Not sure it lists a possible problems our test harness also has but wanted to give a heads-up anyway just in case. 1. https://wiki.ubuntu.com/DashAsBinSh
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > Yeah; bisecting dash with your example script pointed me at cbb71a8 > > (eval: Add assignment built-in support again, 2018-05-19), which indeed > > appears in v0.5.11 (and all subsequent versions). > > > > cbb71a8 points at release 0.3.8-15, which predates Git (and a tag > > pointing at it was never created, since it's behind the big "initial > > import" commit at the beginning of dash.git's history). But skimming > > ChangeLog.O, we see: > > > > * Removed assignment builtins since it is at best undefined by the > > SuS and also can't be implemented consistently. > > > > So this probably didn't work at all between that 0.3.8-15 up until v0.5.11. > > > >> The manual page for recent dash may need an update. > >> Can you perhaps file a bug on their documentation? > > > > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin: > > please feel free to use any of this if it's helpful to you in creating a > > bug report for the dash folks. > > I doubt that we can write off dash v0.5.10 as too old to matter, as > the tagger date seems to be mid 2020 at > > https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11 I agree, and v0.5.10 is widely used, as it is the default shell in Ubuntu 20.04 LTS. > So here is a bit wider "grep" output, looking for > > $ git grep '^[ ]*local[ ].*=' \*.sh > > to reject _any_ assignment on the same line as "local", but I > manually excluded the ones that are not meant to be run with dash. > I obviously excluded the one in t0000 (try_local_xy) that is a > weather-balloon for this exact issue. > > According to this thread, these would not work correctly on dash > older than 0.5.11 and needs fixing by splitting declaration of > variables as locals and assignment of their initial values. I don't think that's necessary. Whatever that (apparently horribly outdated) documentation might state, I think what really matters is what dash actually does. The patch below puts a bit more strain on our `local` weatherballoon test: - It checks assignments from a parameter expansion and command substitution as well, not only simple string assignments. - The expanded values contain spaces and are quoted (to avoid the field splitting issue that started this thread [1]). - Creates a local variable with a name that doesn't hide an already existing variable in the global scope, to see whether it becomes global. It succeeds with all tagged dash versions, the oldest being v0.5.2 tagged on 2005-01-31. I think that's more than old enough to say that it's not necessary to put local variable declaration and assignment on different lines, and we can declare multiple local variables on one line. Note that NetBSD 8.1's /bin/sh need quotes on the RHS in similar assignments as well. [1] We've already run into this issue in ebee5580ca (parallel-checkout: avoid dash local bug in tests, 2021-06-06), but apparently didn't think through what else might be affected by that bug. diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh index 03fead95e7..196106fdf5 100755 --- a/t/t0000-basic.sh +++ b/t/t0000-basic.sh @@ -22,8 +22,8 @@ modification *should* take notice and update the test vectors here. . "$TEST_DIRECTORY"/lib-subtest.sh try_local_xy () { - local x="local" y="alsolocal" && - echo "$x $y" + local x="local" y="${no_such_var-also local}" z="$(echo "third local")" && + echo "$x $y $z" } # Check whether the shell supports the "local" keyword. "local" is not @@ -37,11 +37,11 @@ try_local_xy () { test_expect_success 'verify that the running shell supports "local"' ' x="notlocal" && y="alsonotlocal" && - echo "local alsolocal" >expected1 && + echo "local also local third local" >expected1 && try_local_xy >actual1 && test_cmp expected1 actual1 && - echo "notlocal alsonotlocal" >expected2 && - echo "$x $y" >actual2 && + echo "notlocal alsonotlocal z_unset" >expected2 && + echo "$x $y ${z-z_unset}" >actual2 && test_cmp expected2 actual2 '
On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > >> The manual page for recent dash may need an update. > >> Can you perhaps file a bug on their documentation? > > > > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin: > > please feel free to use any of this if it's helpful to you in creating a > > bug report for the dash folks. > > I doubt that we can write off dash v0.5.10 as too old to matter, as > the tagger date seems to be mid 2020 at > > https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11 That isn't quite what I was implying. What I meant to say was that the dash _manual page_ is out-of-date w.r.t. the dash patch I linked, not that that version is something we could ignore. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > On Tue, Jan 25, 2022 at 09:53:25PM -0800, Junio C Hamano wrote: >> Taylor Blau <me@ttaylorr.com> writes: >> > >> >> The manual page for recent dash may need an update. >> >> Can you perhaps file a bug on their documentation? >> > >> > Yes, I agree that dash.1 is out-of-date after cbb71a8. Konstantin: >> > please feel free to use any of this if it's helpful to you in creating a >> > bug report for the dash folks. >> >> I doubt that we can write off dash v0.5.10 as too old to matter, as >> the tagger date seems to be mid 2020 at >> >> https://kernel.googlesource.com/pub/scm/utils/dash/dash/+/refs/tags/v0.5.11 > > That isn't quite what I was implying. What I meant to say was that the > dash _manual page_ is out-of-date w.r.t. the dash patch I linked, not > that that version is something we could ignore. Oh, I didn't mean it that way. I was continuing from your findings that certain features of "local" may not have been available in 0.5.10; hence the hits from grep that showed any assignment might be problematic with that version and we may need to downgrade our scripts. But as Szeder writes in a response later in the thread, assignment on "local" may have been avaiable in versions before 0.5.11 in a limited form (namely, it is split at $IFS unlike normal assignments, which was the starting point of this thread), which reduces the scope of downgrading necessary by a large margin ;-)
diff --git c/t/t0000-basic.sh w/t/t0000-basic.sh index b007f0efef..915eb2384f 100755 --- c/t/t0000-basic.sh +++ w/t/t0000-basic.sh @@ -45,6 +45,19 @@ test_expect_success 'verify that the running shell supports "local"' ' test_cmp expected2 actual2 ' +try_local_unquoted () { + local x=$1 + y="newvalue" +} + +test_expect_success 'verify that "local x=$1" do not quoting' ' + y=oldvalue && + echo "newvalue" >expect && + try_local_unquoted "x y" && + echo "$y" >actual && + test_cmp expect actual +' + ################################################################ # git init has been done in an empty repository. # make sure it is empty.
Even though this message ends with a patch that could be applied to your tree, it is not for application at all. It is to demonstrate a potential problem in the code in our tree. While I was playing with the "Linux development environment (beta)" on one of my Chromebooks, I say its default /bin/sh (which is Dash 0.5.10) mishandled this construct: func () { local x=$1 clobber x and nothing else and feel safe message="I expect this to be visible by the caller" } func "a message" use "$message" It assigned 'a' to $x in the function, DECLARED the varilable $message as local to the function, hence the caller after func returned did not see what I intended to see in $message. The breakage is subtle; unless you have a character in $1 that would not make a valid variable name, you won't get any error message yet the program would behave in an unexpected way. In other words, all of these hits are suspect and may misbehave with such a shell. $ git grep -e '^[ ]*local.* [a-z0-9_]*=\$' t t/lib-parallel-checkout.sh: local expected_workers=$1 && t/t0000-basic.sh: local x=$1 t/t4011-diff-symlink.sh: local oid=$(printf "%s" "$1" | git hash-object --stdin) && t/t4011-diff-symlink.sh: local oid=$(git hash-object "$1") && t/t4210-log-i18n.sh: local engine=$1 t/t4210-log-i18n.sh: local pattern=$1 t/test-lib-functions.sh: local basename=${1#??} t/test-lib-functions.sh: local var=$1 port t/test-lib-functions.sh: local expr=$(printf '"%s",' "$@") t/test-lib-functions.sh: local expr=$(printf '"%s".*' "$@") Outside t/, I didn't find anything that may be run with dash and can be fed problematic input, so as far as I can tell, it is a problem that only affects the current set of tests. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- t/t0000-basic.sh | 13 +++++++++++++ 1 file changed, 13 insertions(+)