diff mbox

fix "Illegal number" on FreeBSD & macOS for x=; echo $((x))

Message ID 62c16598-2d44-f4c9-991d-9880a8839786@inlv.org (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Headers show

Commit Message

Martijn Dekker March 5, 2018, 12:32 a.m. UTC
dash compiled on Mac OS X (macOS) and FreeBSD manifests the following bug:

$ dash -c 'x=; echo $((x))'
dash: 1: Illegal number:

This error is printed instead of the expected default "0" that should be
substituted for an empty variable in an arithmetic expression.

The bug does not occur on Linux, NetBSD, OpenBSD or Solaris.

I traced the problem to the fact that strtoimax(3) on macOS and FreeBSD
returns EINVAL in errno for an empty string -- unlike those other
systems, which return 0 in errno for an empty string.

POSIX says of strtoimax(3):
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoimax.html
| These functions may fail if:
|
| [EINVAL]
|     No conversion could be performed.

It seems reasonable to consider that "no conversion could be performed"
if the string to convert is empty. Returning EINVAL if no conversion
could be performed is optional ("may fail"). So it seems to me that both
the FreeBSD/macOS behaviour and that of the other systems is POSIX
compliant.

The following patch should eliminate the bug on FreeBSD, macOS and any
other POSIX system which may act the same, without affecting behaviour
on other systems.

- M.

--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Harald van Dijk March 5, 2018, 10:41 p.m. UTC | #1
On 3/5/18 1:32 AM, Martijn Dekker wrote:
> dash compiled on Mac OS X (macOS) and FreeBSD manifests the following bug:
> 
> $ dash -c 'x=; echo $((x))'
> dash: 1: Illegal number:
> 
> This error is printed instead of the expected default "0" that should be
> substituted for an empty variable in an arithmetic expression.
> 
> The bug does not occur on Linux, NetBSD, OpenBSD or Solaris.

There is no reason why dash should behave differently on FreeBSD vs. 
Linux, so I agree with your patch, but isn't this non-standard, isn't 
either behaviour allowed?

dash on Linux also accepts 'x=\ ; echo $((x))' to print 0, and your 
patch also lets that work on FreeBSD.

> I traced the problem to the fact that strtoimax(3) on macOS and FreeBSD
> returns EINVAL in errno for an empty string -- unlike those other
> systems, which return 0 in errno for an empty string.
> 
> POSIX says of strtoimax(3):
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoimax.html
> | These functions may fail if:
> |
> | [EINVAL]
> |     No conversion could be performed.
> 
> It seems reasonable to consider that "no conversion could be performed"
> if the string to convert is empty. Returning EINVAL if no conversion
> could be performed is optional ("may fail"). So it seems to me that both
> the FreeBSD/macOS behaviour and that of the other systems is POSIX
> compliant.
> 
> The following patch should eliminate the bug on FreeBSD, macOS and any
> other POSIX system which may act the same, without affecting behaviour
> on other systems.
> 
> - M.
> 
> diff --git a/src/mystring.c b/src/mystring.c
> index 0106bd2..c7df41f 100644
> --- a/src/mystring.c
> +++ b/src/mystring.c
> @@ -125,7 +125,7 @@ intmax_t atomax(const char *s, int base)
>   	errno = 0;
>   	r = strtoimax(s, &p, base);
> 
> -	if (errno != 0)
> +	if (errno != 0 && !(errno == EINVAL && p == s))

This looks good, it does the job, but it can be simplified a bit:

If errno == EINVAL, then p == s is already known.

If errno != 0 && p == s, then errno == EINVAL is already known.

This allows simplifying to either of the following:

   if (errno != 0 && errno != EINVAL)
   if (errno != 0 && p != s)

But perhaps

   if (errno == ERANGE)

is all that's needed here.

Cheers,
Harald van Dijk

>   		badnum(s);
> 
>   	/*
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martijn Dekker March 5, 2018, 11:23 p.m. UTC | #2
Op 05-03-18 om 22:41 schreef Harald van Dijk:
> On 3/5/18 1:32 AM, Martijn Dekker wrote:
>> dash compiled on Mac OS X (macOS) and FreeBSD manifests the following
>> bug:
>>
>> $ dash -c 'x=; echo $((x))'
>> dash: 1: Illegal number:
>>
>> This error is printed instead of the expected default "0" that should be
>> substituted for an empty variable in an arithmetic expression.
>>
>> The bug does not occur on Linux, NetBSD, OpenBSD or Solaris.
> 
> There is no reason why dash should behave differently on FreeBSD vs.
> Linux, so I agree with your patch, but isn't this non-standard, isn't
> either behaviour allowed?

I don't think so.

2.6.4 Arithmetic Expansion:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04
"The arithmetic expression shall be processed according to the rules
given in Arithmetic Precision and Operations [...]".

Arithmetic Precision and Operations:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_01_02_01
"All variables shall be initialized to zero if they are not otherwise
assigned by the input to the application."

Also, consensus among existing shells appears to be universal.

> This looks good, it does the job, but it can be simplified a bit:
> 
> If errno == EINVAL, then p == s is already known.
> 
> If errno != 0 && p == s, then errno == EINVAL is already known.
> 
> This allows simplifying to either of the following:
> 
>   if (errno != 0 && errno != EINVAL)
>   if (errno != 0 && p != s)

Good point.

> But perhaps
> 
>   if (errno == ERANGE)
> 
> is all that's needed here.

Explain?

Thanks,

- M.
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Harald van Dijk March 6, 2018, 12:23 a.m. UTC | #3
On 3/6/18 12:23 AM, Martijn Dekker wrote:
> Op 05-03-18 om 22:41 schreef Harald van Dijk:
>> On 3/5/18 1:32 AM, Martijn Dekker wrote:
>>> dash compiled on Mac OS X (macOS) and FreeBSD manifests the following
>>> bug:
>>>
>>> $ dash -c 'x=; echo $((x))'
>>> dash: 1: Illegal number:
>>>
>>> This error is printed instead of the expected default "0" that should be
>>> substituted for an empty variable in an arithmetic expression.
>>>
>>> The bug does not occur on Linux, NetBSD, OpenBSD or Solaris.
>>
>> There is no reason why dash should behave differently on FreeBSD vs.
>> Linux, so I agree with your patch, but isn't this non-standard, isn't
>> either behaviour allowed?
> 
> I don't think so.
> 
> 2.6.4 Arithmetic Expansion:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04
> "The arithmetic expression shall be processed according to the rules
> given in Arithmetic Precision and Operations [...]".
> 
> Arithmetic Precision and Operations:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_01_02_01
> "All variables shall be initialized to zero if they are not otherwise
> assigned by the input to the application."

In your example, x has been assigned. Its value is empty, but empty and 
unset are two different things.

But I see now that dash gives the same error when x is unset. That part 
is definitely wrong for exactly the reason you point out.

> Also, consensus among existing shells appears to be universal.

Mostly, but not fully. yash is the exception:

   $ yash -c 'unset x; echo $((x))'
   0
   $ yash -c 'x=; echo $((x))'


yash is pretty special when it comes to shell arithmetic:

   $ yash -c 'x=abc; echo $((x))'
   abc
   $ yash -c 'x=abc; echo $((x+0))'
   yash: arithmetic: `abc' is not a valid number

>> This looks good, it does the job, but it can be simplified a bit:
>>
>> If errno == EINVAL, then p == s is already known.
>>
>> If errno != 0 && p == s, then errno == EINVAL is already known.
>>
>> This allows simplifying to either of the following:
>>
>>    if (errno != 0 && errno != EINVAL)
>>    if (errno != 0 && p != s)
> 
> Good point.
> 
>> But perhaps
>>
>>    if (errno == ERANGE)
>>
>> is all that's needed here.
> 
> Explain?

Since base is always a constant 0 or a constant 10, never a 
user-provided value, the only error that strtoimax will ever report on 
glibc systems is ERANGE. Checking only ERANGE therefore preserves the 
glibc behaviour, and allows the exact same set of errors to be detected 
on non-glibc systems.

Cheers,
Harald van Dijk

> Thanks,
> 
> - M.
> 
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martijn Dekker March 6, 2018, 1:15 a.m. UTC | #4
Op 06-03-18 om 00:23 schreef Harald van Dijk:
> On 3/6/18 12:23 AM, Martijn Dekker wrote:
>> "All variables shall be initialized to zero if they are not otherwise
>> assigned by the input to the application."
> 
> In your example, x has been assigned. Its value is empty, but empty and
> unset are two different things.
> 
> But I see now that dash gives the same error when x is unset. That part
> is definitely wrong for exactly the reason you point out.

By default, all unset variables are considered empty, so they should act
the same.

That brings me to another possible issue, though:

$ dash -uc 'unset x; echo $((x))'
0

Shouldn't that give an "parameter not set" error under set -u?
Interestingly, only bash and AT&T ksh93 currently do that.


>> Also, consensus among existing shells appears to be universal.
> 
> Mostly, but not fully. yash is the exception:
> 
>   $ yash -c 'unset x; echo $((x))'
>   0
>   $ yash -c 'x=; echo $((x))'
> 


But in POSIX mode it acts like other shells:

$ yash -o posix -c 'x=; echo $((x))'
0


> yash is pretty special when it comes to shell arithmetic:
> 
>   $ yash -c 'x=abc; echo $((x))'
>   abc
>   $ yash -c 'x=abc; echo $((x+0))'
>   yash: arithmetic: `abc' is not a valid number


$ yash -o posix -c 'x=abc; echo $((x))'
yash: arithmetic: `abc' is not a valid number


>>> But perhaps
>>>
>>>    if (errno == ERANGE)
>>>
>>> is all that's needed here.
>>
>> Explain?
> 
> Since base is always a constant 0 or a constant 10, never a
> user-provided value, the only error that strtoimax will ever report on
> glibc systems is ERANGE. Checking only ERANGE therefore preserves the
> glibc behaviour, and allows the exact same set of errors to be detected
> on non-glibc systems.


That makes sense, thanks.

- M.
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Harald van Dijk March 6, 2018, 2:50 a.m. UTC | #5
On 3/6/18 2:15 AM, Martijn Dekker wrote:
> Op 06-03-18 om 00:23 schreef Harald van Dijk:
>> On 3/6/18 12:23 AM, Martijn Dekker wrote:
>>> "All variables shall be initialized to zero if they are not otherwise
>>> assigned by the input to the application."
>>
>> In your example, x has been assigned. Its value is empty, but empty and
>> unset are two different things.
>>
>> But I see now that dash gives the same error when x is unset. That part
>> is definitely wrong for exactly the reason you point out.
> 
> By default, all unset variables are considered empty, so they should act
> the same.

Except where special behaviour is specified for unset variables, which I 
believed to be the case here. But see below.

> That brings me to another possible issue, though:
> 
> $ dash -uc 'unset x; echo $((x))'
> 0
> 
> Shouldn't that give an "parameter not set" error under set -u?
> Interestingly, only bash and AT&T ksh93 currently do that.

This is <http://austingroupbugs.net/view.php?id=559>. It's unspecified 
for now, but will likely become an error in the future.

>>> Also, consensus among existing shells appears to be universal.
>>
>> Mostly, but not fully. yash is the exception:
>>
>>    $ yash -c 'unset x; echo $((x))'
>>    0
>>    $ yash -c 'x=; echo $((x))'
>>
> 
> 
> But in POSIX mode it acts like other shells:
> 
> $ yash -o posix -c 'x=; echo $((x))'
> 0

I stand corrected! Indeed, then it does appear that all shells are in 
agreement. And when POSIX is (arguably) unclear but all shells are in 
agreement, the shells' behaviour should generally just be taken as the 
correct behaviour.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu March 7, 2018, 6:26 a.m. UTC | #6
Martijn Dekker <martijn@inlv.org> wrote:
>
>> Since base is always a constant 0 or a constant 10, never a
>> user-provided value, the only error that strtoimax will ever report on
>> glibc systems is ERANGE. Checking only ERANGE therefore preserves the
>> glibc behaviour, and allows the exact same set of errors to be detected
>> on non-glibc systems.
> 
> That makes sense, thanks.

Could you resend your patch with this change please?

Thanks,
diff mbox

Patch

diff --git a/src/mystring.c b/src/mystring.c
index 0106bd2..c7df41f 100644
--- a/src/mystring.c
+++ b/src/mystring.c
@@ -125,7 +125,7 @@  intmax_t atomax(const char *s, int base)
 	errno = 0;
 	r = strtoimax(s, &p, base);

-	if (errno != 0)
+	if (errno != 0 && !(errno == EINVAL && p == s))
 		badnum(s);

 	/*