diff mbox

don't record empty IFS scan regions

Message ID 72d906db-0734-cef7-e941-2c306835974a@inlv.org (mailing list archive)
State Superseded
Delegated to: Herbert Xu
Headers show

Commit Message

Martijn Dekker March 22, 2018, 9:38 p.m. UTC
Op 22-03-18 om 20:28 schreef Harald van Dijk:
> On 22/03/2018 03:40, Martijn Dekker wrote:
>> evalvar() records empty expansion results (varlen == 0) as string
>> regions that need to be scanned for IFS characters. This is pointless,
>> because there is nothing to split.
> 
> varlen may be set to -1 when an unset variable is expanded. If it's
> beneficial to avoid recording empty regions to scan for IFS characters,
> should those also be excluded?

Agreed. Updated patch attached.

>> This patch fixes the bug that, given no positional parameters, unquoted
>> $@ and $* incorrectly generate one empty field (they should generate no
>> fields). Apparently that was a side effect of the above.
> 
> This seems weird though. If you want to remove the recording of empty
> regions because they are pointless, then how does removing them fix a
> bug? Doesn't this show that empty regions do have an effect? Perhaps
> they're not supposed to have any effect, perhaps it's a specific
> combination of empty regions and something else that triggers some bug,
> and perhaps that combination can no longer occur with your patch.

The latter is my guess, but I haven't had time to investigate it.

I *did* investigate whether the patch introduces other bugs, and haven't
found any new ones.

I successfully ran this large IFS and PPs test script from AT&T against
dash with this patch:

http://web.archive.org/web/20050414022354/http://www.research.att.com/~gsf/public/ifs.sh

My own modernish regression test suite, which tests some pretty crazy
things including lots of IFS and positional parameters stuff, also
passes completely:

git clone https://github.com/modernish/modernish
cd modernish
dash bin/modernish --test
(still under development, may occasionally break)

- M.

Comments

Harald van Dijk March 22, 2018, 10:16 p.m. UTC | #1
On 22/03/2018 22:38, Martijn Dekker wrote:
> Op 22-03-18 om 20:28 schreef Harald van Dijk:
>> On 22/03/2018 03:40, Martijn Dekker wrote:
>>> This patch fixes the bug that, given no positional parameters, unquoted
>>> $@ and $* incorrectly generate one empty field (they should generate no
>>> fields). Apparently that was a side effect of the above.
>>
>> This seems weird though. If you want to remove the recording of empty
>> regions because they are pointless, then how does removing them fix a
>> bug? Doesn't this show that empty regions do have an effect? Perhaps
>> they're not supposed to have any effect, perhaps it's a specific
>> combination of empty regions and something else that triggers some bug,
>> and perhaps that combination can no longer occur with your patch.
> 
> The latter is my guess, but I haven't had time to investigate it.

Looking into it again:

When IFS is set to an empty string, sepc is set to '\0' in varvalue(). 
This then causes *quotedp to be set to true, meaning evalvar()'s quoted 
variable is turned on. quoted is then passed to recordregion() as the 
nulonly parameter.

ifsp->nulonly has a bigger effect than merely selecting whether to use 
$IFS or whether to only split on null bytes: in ifsbreakup(), nulonly 
also causes string termination to be suppressed. That's correct: that 
special treatment is required to preserve empty fields in "$@" 
expansion. But it should *only* be used when $@ is quoted: ifsbreakup() 
takes nulonly from the last IFS region, even if it's empty, so having an 
additional zero-length region with nulonly enabled causes confusion.

Passing quoted by value to varvalue() and not attempting to modify it 
should therefore, and in my quick testing does, also work to fix the 
original $@ bug.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martijn Dekker March 22, 2018, 10:28 p.m. UTC | #2
Op 22-03-18 om 23:16 schreef Harald van Dijk:
> Passing quoted by value to varvalue() and not attempting to modify it
> should therefore, and in my quick testing does, also work to fix the
> original $@ bug.

Then maybe both that and not recording empty split regions should be done.

My C-fu may not be all that strong but I think my reasoning for my patch
makes straightforward logical sense based on the comment description in
expand.c of what recordregion() does:
/*
 * Record the fact that we have to scan this region of the
 * string for IFS characters.
 */
This is pointless if there is nothing to scan.

Thus, making both changes should fix the bug while implementing a slight
optimisation and increasing code integrity.

(Speaking of code integrity, IMO that mess of gotos is in dire need of
refactoring. But I'm not the person to do that.)

- M.
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/src/expand.c b/src/expand.c
index 705fef7..c3d20a4 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -771,7 +771,7 @@  vsplus:
 
 	if (subtype == VSNORMAL) {
 record:
-		if (!easy)
+		if (!easy || varlen <= 0)
 			goto end;
 		recordregion(startloc, expdest - (char *)stackblock(), quoted);
 		goto end;