diff mbox series

Fix error: array subscript has type 'char'

Message ID 20210112181242.1570-1-bouyer@antioche.eu.org (mailing list archive)
State New, archived
Headers show
Series Fix error: array subscript has type 'char' | expand

Commit Message

Manuel Bouyer Jan. 12, 2021, 6:12 p.m. UTC
From: Manuel Bouyer <bouyer@netbsd.org>

Use unsigned char variable, or cast to (unsigned char), for
tolower()/islower() and friends. Fix compiler error
array subscript has type 'char' [-Werror=char-subscripts]

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
---
 tools/libs/light/libxl_qmp.c | 2 +-
 tools/xentrace/xentrace.c    | 2 +-
 xen/tools/symbols.c          | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

Comments

Jan Beulich Jan. 14, 2021, 10:53 a.m. UTC | #1
On 12.01.2021 19:12, Manuel Bouyer wrote:
> From: Manuel Bouyer <bouyer@netbsd.org>
> 
> Use unsigned char variable, or cast to (unsigned char), for
> tolower()/islower() and friends. Fix compiler error
> array subscript has type 'char' [-Werror=char-subscripts]

Isn't this something that wants changing in your ctype.h instead?
the functions (or macros), as per the C standard, ought to accept
plain char aiui, and if they use the input as an array subscript,
it should be their implementation suitably converting type first.

Jan
Manuel Bouyer Jan. 14, 2021, 12:29 p.m. UTC | #2
On Thu, Jan 14, 2021 at 11:53:20AM +0100, Jan Beulich wrote:
> On 12.01.2021 19:12, Manuel Bouyer wrote:
> > From: Manuel Bouyer <bouyer@netbsd.org>
> > 
> > Use unsigned char variable, or cast to (unsigned char), for
> > tolower()/islower() and friends. Fix compiler error
> > array subscript has type 'char' [-Werror=char-subscripts]
> 
> Isn't this something that wants changing in your ctype.h instead?
> the functions (or macros), as per the C standard, ought to accept
> plain char aiui, and if they use the input as an array subscript,
> it should be their implementation suitably converting type first.

I asked for inputs from NetBSD developers familiar with this part.

Although the parameter is an int, only a subset of values is valid,
as stated in ISO C 2018 (Section 7.4 paragrah 1):
> In all cases the argument is an int, the value of which shall be
> representable as an unsigned char or shall equal the value of the
> macro EOF.  If the argument has any other value, the behavior is 
> undefined.                               


As stated by NetBSD's ctype(3) manual page, NetBSD and glibc took different
approach. NetBSD emits a compile-time warning if the input may lead to
undefined behavior. quoting the man page:
     Some implementations of libc, such as glibc as of 2018, attempt to avoid
     the worst of the undefined behavior by defining the functions to work for
     all integer inputs representable by either unsigned char or char, and
     suppress the warning.  However, this is not an excuse for avoiding
     conversion to unsigned char: if EOF coincides with any such value, as it
     does when it is -1 on platforms with signed char, programs that pass char
     will still necessarily confuse the classification and mapping of EOF with
     the classification and mapping of some non-EOF inputs.


So, although no warning is emmited on linux, it looks like to me that the
cast to unsigned char is needed anyway, and relying on glibc's behavior
is not portable.
Jan Beulich Jan. 14, 2021, 1:25 p.m. UTC | #3
On 14.01.2021 13:29, Manuel Bouyer wrote:
> On Thu, Jan 14, 2021 at 11:53:20AM +0100, Jan Beulich wrote:
>> On 12.01.2021 19:12, Manuel Bouyer wrote:
>>> From: Manuel Bouyer <bouyer@netbsd.org>
>>>
>>> Use unsigned char variable, or cast to (unsigned char), for
>>> tolower()/islower() and friends. Fix compiler error
>>> array subscript has type 'char' [-Werror=char-subscripts]
>>
>> Isn't this something that wants changing in your ctype.h instead?
>> the functions (or macros), as per the C standard, ought to accept
>> plain char aiui, and if they use the input as an array subscript,
>> it should be their implementation suitably converting type first.
> 
> I asked for inputs from NetBSD developers familiar with this part.
> 
> Although the parameter is an int, only a subset of values is valid,
> as stated in ISO C 2018 (Section 7.4 paragrah 1):
>> In all cases the argument is an int, the value of which shall be
>> representable as an unsigned char or shall equal the value of the
>> macro EOF.  If the argument has any other value, the behavior is 
>> undefined.                               

Which means you're shifting the undefined-ness from the implementation
(using the value as array index) to the callers (truncating values, or
converting value range). In particular I do not think that the
intention behind the standard's wording is for every caller to need to
cast to unsigned char, when e.g. character literals are of type char
and string literals are of type const char[].

> As stated by NetBSD's ctype(3) manual page, NetBSD and glibc took different
> approach. NetBSD emits a compile-time warning if the input may lead to
> undefined behavior. quoting the man page:
>      Some implementations of libc, such as glibc as of 2018, attempt to avoid
>      the worst of the undefined behavior by defining the functions to work for
>      all integer inputs representable by either unsigned char or char, and
>      suppress the warning.  However, this is not an excuse for avoiding
>      conversion to unsigned char: if EOF coincides with any such value, as it
>      does when it is -1 on platforms with signed char, programs that pass char
>      will still necessarily confuse the classification and mapping of EOF with
>      the classification and mapping of some non-EOF inputs.
> 
> 
> So, although no warning is emmited on linux, it looks like to me that the
> cast to unsigned char is needed anyway, and relying on glibc's behavior
> is not portable.

I fully agree we shouldn't rely on glibc's behavior (I'm sure
you've looked at xen/include/xen/ctype.h to see how we address
this it Xen itself, but I will admit this is to a degree comparing
apples and oranges, not the least because the lack of a need to
consider EOF in Xen). At least in xen/tools/symbols.c I don't
think we're at risk of running into "undefined" space. Casts are
generally desirable to avoid whenever possible, as they come with
their own set of risks. Granted this is "only" user space code,
but still.

Jan
Manuel Bouyer Jan. 14, 2021, 2:16 p.m. UTC | #4
On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
> On 14.01.2021 13:29, Manuel Bouyer wrote:
> > On Thu, Jan 14, 2021 at 11:53:20AM +0100, Jan Beulich wrote:
> >> On 12.01.2021 19:12, Manuel Bouyer wrote:
> >>> From: Manuel Bouyer <bouyer@netbsd.org>
> >>>
> >>> Use unsigned char variable, or cast to (unsigned char), for
> >>> tolower()/islower() and friends. Fix compiler error
> >>> array subscript has type 'char' [-Werror=char-subscripts]
> >>
> >> Isn't this something that wants changing in your ctype.h instead?
> >> the functions (or macros), as per the C standard, ought to accept
> >> plain char aiui, and if they use the input as an array subscript,
> >> it should be their implementation suitably converting type first.
> > 
> > I asked for inputs from NetBSD developers familiar with this part.
> > 
> > Although the parameter is an int, only a subset of values is valid,
> > as stated in ISO C 2018 (Section 7.4 paragrah 1):
> >> In all cases the argument is an int, the value of which shall be
> >> representable as an unsigned char or shall equal the value of the
> >> macro EOF.  If the argument has any other value, the behavior is 
> >> undefined.                               
> 
> Which means you're shifting the undefined-ness from the implementation
> (using the value as array index) to the callers (truncating values, or
> converting value range). In particular I do not think that the
> intention behind the standard's wording is for every caller to need to
> cast to unsigned char, when e.g. character literals are of type char
> and string literals are of type const char[].

If you don't cast you fall into the undefined behavior case for non-ascii
characters. For example, "é" in iso-latin-1 is 0xe9. In a signed char, this is
-23 (decimal). without the cast, tolower() will see -23.
If it is casted to (unsigned char) first, tolower() will see 233, as expected.
The following test program illustrates this:
armandeche:/tmp>cat toto.c
#include <stdio.h>

int
main(int _c, const char *_v[])
{
        char c = 0xe9;
	printf("%d %d\n", (int)c, (int)(unsigned char)c);
}
armandeche:/tmp>./toto 
-23 233



> 
> > As stated by NetBSD's ctype(3) manual page, NetBSD and glibc took different
> > approach. NetBSD emits a compile-time warning if the input may lead to
> > undefined behavior. quoting the man page:
> >      Some implementations of libc, such as glibc as of 2018, attempt to avoid
> >      the worst of the undefined behavior by defining the functions to work for
> >      all integer inputs representable by either unsigned char or char, and
> >      suppress the warning.  However, this is not an excuse for avoiding
> >      conversion to unsigned char: if EOF coincides with any such value, as it
> >      does when it is -1 on platforms with signed char, programs that pass char
> >      will still necessarily confuse the classification and mapping of EOF with
> >      the classification and mapping of some non-EOF inputs.
> > 
> > 
> > So, although no warning is emmited on linux, it looks like to me that the
> > cast to unsigned char is needed anyway, and relying on glibc's behavior
> > is not portable.
> 
> I fully agree we shouldn't rely on glibc's behavior (I'm sure
> you've looked at xen/include/xen/ctype.h to see how we address
> this it Xen itself, but I will admit this is to a degree comparing
> apples and oranges, not the least because the lack of a need to
> consider EOF in Xen). At least in xen/tools/symbols.c I don't
> think we're at risk of running into "undefined" space. Casts are

as long as there's only ascii characters.

Anyway NetBSD won't change its ctype.h
Andrew Cooper Jan. 18, 2021, 7:08 p.m. UTC | #5
Hello,

I've committed some of the trivial and reviewed patches.

I think all others have outstanding questions, or really need the
identified maintainers to comment.

~Andrew
Manuel Bouyer Jan. 18, 2021, 7:11 p.m. UTC | #6
On Mon, Jan 18, 2021 at 07:08:58PM +0000, Andrew Cooper wrote:
> Hello,
> 
> I've committed some of the trivial and reviewed patches.
> 
> I think all others have outstanding questions, or really need the
> identified maintainers to comment.

thanks. I will try to adress the questions later this week.
Manuel Bouyer Jan. 26, 2021, 5:44 p.m. UTC | #7
On Thu, Jan 14, 2021 at 03:16:15PM +0100, Manuel Bouyer wrote:
> On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
> > On 14.01.2021 13:29, Manuel Bouyer wrote:
> > > On Thu, Jan 14, 2021 at 11:53:20AM +0100, Jan Beulich wrote:
> > >> On 12.01.2021 19:12, Manuel Bouyer wrote:
> > >>> From: Manuel Bouyer <bouyer@netbsd.org>
> > >>>
> > >>> Use unsigned char variable, or cast to (unsigned char), for
> > >>> tolower()/islower() and friends. Fix compiler error
> > >>> array subscript has type 'char' [-Werror=char-subscripts]
> > >>
> > >> Isn't this something that wants changing in your ctype.h instead?
> > >> the functions (or macros), as per the C standard, ought to accept
> > >> plain char aiui, and if they use the input as an array subscript,
> > >> it should be their implementation suitably converting type first.
> > > 
> > > I asked for inputs from NetBSD developers familiar with this part.
> > > 
> > > Although the parameter is an int, only a subset of values is valid,
> > > as stated in ISO C 2018 (Section 7.4 paragrah 1):
> > >> In all cases the argument is an int, the value of which shall be
> > >> representable as an unsigned char or shall equal the value of the
> > >> macro EOF.  If the argument has any other value, the behavior is 
> > >> undefined.                               
> > 
> > Which means you're shifting the undefined-ness from the implementation
> > (using the value as array index) to the callers (truncating values, or
> > converting value range). In particular I do not think that the
> > intention behind the standard's wording is for every caller to need to
> > cast to unsigned char, when e.g. character literals are of type char
> > and string literals are of type const char[].
> 
> If you don't cast you fall into the undefined behavior case for non-ascii
> characters. For example, "é" in iso-latin-1 is 0xe9. In a signed char, this is
> -23 (decimal). without the cast, tolower() will see -23.
> If it is casted to (unsigned char) first, tolower() will see 233, as expected.
> The following test program illustrates this:
> armandeche:/tmp>cat toto.c
> #include <stdio.h>
> 
> int
> main(int _c, const char *_v[])
> {
>         char c = 0xe9;
> 	printf("%d %d\n", (int)c, (int)(unsigned char)c);
> }
> armandeche:/tmp>./toto 
> -23 233
> 
> 
> 
> > 
> > > As stated by NetBSD's ctype(3) manual page, NetBSD and glibc took different
> > > approach. NetBSD emits a compile-time warning if the input may lead to
> > > undefined behavior. quoting the man page:
> > >      Some implementations of libc, such as glibc as of 2018, attempt to avoid
> > >      the worst of the undefined behavior by defining the functions to work for
> > >      all integer inputs representable by either unsigned char or char, and
> > >      suppress the warning.  However, this is not an excuse for avoiding
> > >      conversion to unsigned char: if EOF coincides with any such value, as it
> > >      does when it is -1 on platforms with signed char, programs that pass char
> > >      will still necessarily confuse the classification and mapping of EOF with
> > >      the classification and mapping of some non-EOF inputs.
> > > 
> > > 
> > > So, although no warning is emmited on linux, it looks like to me that the
> > > cast to unsigned char is needed anyway, and relying on glibc's behavior
> > > is not portable.
> > 
> > I fully agree we shouldn't rely on glibc's behavior (I'm sure
> > you've looked at xen/include/xen/ctype.h to see how we address
> > this it Xen itself, but I will admit this is to a degree comparing
> > apples and oranges, not the least because the lack of a need to
> > consider EOF in Xen). At least in xen/tools/symbols.c I don't
> > think we're at risk of running into "undefined" space. Casts are
> 
> as long as there's only ascii characters.
> 
> Anyway NetBSD won't change its ctype.h

I guess I'm going to give up on this one. We'll keep it as a local patch.
Ian Jackson Jan. 26, 2021, 5:59 p.m. UTC | #8
Manuel Bouyer writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
> On Thu, Jan 14, 2021 at 03:16:15PM +0100, Manuel Bouyer wrote:
> > On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
> > > Which means you're shifting the undefined-ness from the implementation

The undefined-ness is in the *specification*. [1]

> > > Isn't this something that wants changing in your ctype.h instead?
> > > the functions (or macros), as per the C standard

Jan, can you please check your facts before asserting the existence
of a pretty blatant bug in a platform's implementation of basic C
functions ?

From my copy of C99+TC1+TC2, para 7.4:

 1   In all cases the argument is an int, the value of which shall be
     representable as an unsigned char or shall equal the value of the
     macro EOF.  If the argument has any other value, the behavior is
     undefined. [...]

If char is signed, then it can contain -ve values.  Those values are
promoted to int by the usual integer promotions.  Naturally such
negative values are not representable as unsigned char.  Passing them
to ctype macros is UB.

So Manuel's ctypes.h is conforming and the compiler warning (which I
have seen on Linux too) is correct.

> I guess I'm going to give up on this one. We'll keep it as a local patch.

Manuel, your original patch:

Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

(The R-A is FTAOD since IMO this is a bugfix.)

Ian.

[1] I agree that "fixing" ctypes.h everywhere might be nice (eg the
way glibc does it) but that has other implications and it is not
reasonable to expect platforms supporting Xen to do that.
Jan Beulich Jan. 27, 2021, 8:31 a.m. UTC | #9
On 26.01.2021 18:59, Ian Jackson wrote:
> Manuel Bouyer writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
>> On Thu, Jan 14, 2021 at 03:16:15PM +0100, Manuel Bouyer wrote:
>>> On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
>>>> Which means you're shifting the undefined-ness from the implementation
> 
> The undefined-ness is in the *specification*. [1]
> 
>>>> Isn't this something that wants changing in your ctype.h instead?
>>>> the functions (or macros), as per the C standard
> 
> Jan, can you please check your facts before asserting the existence
> of a pretty blatant bug in a platform's implementation of basic C
> functions ?
> 
> From my copy of C99+TC1+TC2, para 7.4:
> 
>  1   In all cases the argument is an int, the value of which shall be
>      representable as an unsigned char or shall equal the value of the
>      macro EOF.  If the argument has any other value, the behavior is
>      undefined. [...]
> 
> If char is signed, then it can contain -ve values.  Those values are
> promoted to int by the usual integer promotions.  Naturally such
> negative values are not representable as unsigned char.  Passing them
> to ctype macros is UB.

I did read that part of the spec before replying. Irrespective
of the wording there it seems entirely unreasonable to me for
the spec to imply all use sites of the is...() functions to
have to use casts. Even more so that we all know (I suppose)
that casts can be dangerous as both potentially introducing
bugs (perhaps not at the point of their addition, but later
when code elsewhere gets changed) and keeping analysis tools
from actually spotting ones.

But yes, I'm not the maintainer of this code, so if you're
happy with such risks, so be it.

For the record, to me the less risky approach here would seem
to have been to make use of compilers allowing to choose
whether plain char is signed or unsigned, and force it to
unsigned for at least the affected components.

Jan
Jan Beulich Jan. 27, 2021, 8:37 a.m. UTC | #10
On 27.01.2021 09:31, Jan Beulich wrote:
> But yes, I'm not the maintainer of this code, so if you're
> happy with such risks, so be it.

And actually I was only partly right here - there's one hunk
here affecting code I'm the maintainer just as much as you
are. At least there I'd like to ask that ...

> For the record, to me the less risky approach here would seem
> to have been to make use of compilers allowing to choose
> whether plain char is signed or unsigned, and force it to
> unsigned for at least the affected components.

... this be considered as an alternative, before I maybe
withdraw my de-facto nak.

(To amend my earlier reply: Such a command line option
addition could then also be properly commented, to explain
the particular need for the option.)

Jan
Ian Jackson Jan. 27, 2021, 1:53 p.m. UTC | #11
Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
> I did read that part of the spec before replying.

I find this quite astonishing.  You claimed that FreeBSD's libc was
buggy *after having read the spec to which you agree it conforms*.

>   Irrespective of the wording there it seems entirely unreasonable
> to me for the spec to imply all use sites of the is...() functions
> to have to use casts. Even more so that we all know (I suppose) that
> casts can be dangerous as both potentially introducing bugs (perhaps
> not at the point of their addition, but later when code elsewhere
> gets changed) and keeping analysis tools from actually spotting
> ones.

Nevertheless, this is the design of the C standard.  A common approach
to this problem is something like this (from libxl_internal.h):

  /*
   * int CTYPE(ISFOO, char c);
   * int CTYPE(toupper, char c);
   * int CTYPE(tolower, char c);
   *
   * This is necessary because passing a simple char to a ctype.h
   * is forbidden.  ctype.h macros take ints derived from _unsigned_ chars.
   *
   * If you have a char which might be EOF then you should already have
   * it in an int representing an unsigned char, and you can use the
   * <ctype.h> macros directly.  This generally happens only with values
   * from fgetc et al.
   *
   * For any value known to be a character (eg, anything that came from
   * a char[]), use CTYPE.
   */
  #define CTYPE(isfoo,c) (isfoo((unsigned char)(c)))

Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
> On 27.01.2021 09:31, Jan Beulich wrote:
> > But yes, I'm not the maintainer of this code, so if you're
> > happy with such risks, so be it.
> 
> And actually I was only partly right here - there's one hunk
> here affecting code I'm the maintainer just as much as you
> are. At least there I'd like to ask that ...
> 
> > For the record, to me the less risky approach here would seem
> > to have been to make use of compilers allowing to choose
> > whether plain char is signed or unsigned, and force it to
> > unsigned for at least the affected components.
> 
> ... this be considered as an alternative, before I maybe
> withdraw my de-facto nak.

Whether char is signed or unsigned is generally specified in the
platform API/ABI.  Deviating from this for userland code is not
possible or reasonable since it would involve processing the system
headers with a deviant langauge definition!

Deviating from it for hypervisor code would be possible but I think it
would be unwise.

Ian.
Jan Beulich Jan. 27, 2021, 2:33 p.m. UTC | #12
On 27.01.2021 14:53, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
>> On 27.01.2021 09:31, Jan Beulich wrote:
>>> But yes, I'm not the maintainer of this code, so if you're
>>> happy with such risks, so be it.
>>
>> And actually I was only partly right here - there's one hunk
>> here affecting code I'm the maintainer just as much as you
>> are. At least there I'd like to ask that ...
>>
>>> For the record, to me the less risky approach here would seem
>>> to have been to make use of compilers allowing to choose
>>> whether plain char is signed or unsigned, and force it to
>>> unsigned for at least the affected components.
>>
>> ... this be considered as an alternative, before I maybe
>> withdraw my de-facto nak.
> 
> Whether char is signed or unsigned is generally specified in the
> platform API/ABI.  Deviating from this for userland code is not
> possible or reasonable since it would involve processing the system
> headers with a deviant langauge definition!

I don't think I've ever come across that part of a platform
API/ABI spec. Instead my understanding so far was that good
platform headers would be ignorant of the user's choice of
char's signed-ness (wherever compilers offer such a choice,
but I think all that I've ever worked with did). At the very
least gcc's doc doesn't warn about any possible
incompatibilities resulting from use of -fsigned-char or
-funsigned-char (or their bitfield equivalents, for that
matter).

Jan
Ian Jackson Jan. 27, 2021, 4:21 p.m. UTC | #13
Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char' [and 1 more messages]"):
> I don't think I've ever come across that part of a platform
> API/ABI spec. Instead my understanding so far was that good
> platform headers would be ignorant of the user's choice of
> char's signed-ness (wherever compilers offer such a choice,
> but I think all that I've ever worked with did). At the very
> least gcc's doc doesn't warn about any possible
> incompatibilities resulting from use of -fsigned-char or
> -funsigned-char (or their bitfield equivalents, for that
> matter).

Well, I've considered this and I still don't think changing to
-funsigned-char is good idea.

Are you OK with me checking in the current patch or should I ask the
other committers for a second opinion ?

Thanks,
Ian.
Jan Beulich Jan. 27, 2021, 4:32 p.m. UTC | #14
On 27.01.2021 17:21, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char' [and 1 more messages]"):
>> I don't think I've ever come across that part of a platform
>> API/ABI spec. Instead my understanding so far was that good
>> platform headers would be ignorant of the user's choice of
>> char's signed-ness (wherever compilers offer such a choice,
>> but I think all that I've ever worked with did). At the very
>> least gcc's doc doesn't warn about any possible
>> incompatibilities resulting from use of -fsigned-char or
>> -funsigned-char (or their bitfield equivalents, for that
>> matter).
> 
> Well, I've considered this and I still don't think changing to
> -funsigned-char is good idea.
> 
> Are you OK with me checking in the current patch or should I ask the
> other committers for a second opinion ?

For the changes to tools/ it's really up to you. For the change
to xen/tools/symbols.c I could live with it (for being user
space code), but I still think adding casts in such a place is
not necessarily setting a good precedent. So for this one I'd
indeed appreciate getting another opinion.

Jan
George Dunlap Jan. 27, 2021, 4:52 p.m. UTC | #15
> On Jan 27, 2021, at 4:32 PM, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 27.01.2021 17:21, Ian Jackson wrote:
>> Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char' [and 1 more messages]"):
>>> I don't think I've ever come across that part of a platform
>>> API/ABI spec. Instead my understanding so far was that good
>>> platform headers would be ignorant of the user's choice of
>>> char's signed-ness (wherever compilers offer such a choice,
>>> but I think all that I've ever worked with did). At the very
>>> least gcc's doc doesn't warn about any possible
>>> incompatibilities resulting from use of -fsigned-char or
>>> -funsigned-char (or their bitfield equivalents, for that
>>> matter).
>> 
>> Well, I've considered this and I still don't think changing to
>> -funsigned-char is good idea.
>> 
>> Are you OK with me checking in the current patch or should I ask the
>> other committers for a second opinion ?
> 
> For the changes to tools/ it's really up to you. For the change
> to xen/tools/symbols.c I could live with it (for being user
> space code), but I still think adding casts in such a place is
> not necessarily setting a good precedent. So for this one I'd
> indeed appreciate getting another opinion.

My thoughts:

* On the whole, the risk of an incompatibility with system headers does seem higher than the risk of casting a value which is known not to be EOF

* Such a change doesn’t seem like the kind of thing we should ask Manuel to do, when a simpler change will do the trick

* At any rate it doesn’t seem like a good thing to experiment with in the week before the feature freeze.

 -George
Jan Beulich Jan. 27, 2021, 5 p.m. UTC | #16
On 27.01.2021 17:52, George Dunlap wrote:
> 
> 
>> On Jan 27, 2021, at 4:32 PM, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 27.01.2021 17:21, Ian Jackson wrote:
>>> Jan Beulich writes ("Re: [PATCH] Fix error: array subscript has type 'char' [and 1 more messages]"):
>>>> I don't think I've ever come across that part of a platform
>>>> API/ABI spec. Instead my understanding so far was that good
>>>> platform headers would be ignorant of the user's choice of
>>>> char's signed-ness (wherever compilers offer such a choice,
>>>> but I think all that I've ever worked with did). At the very
>>>> least gcc's doc doesn't warn about any possible
>>>> incompatibilities resulting from use of -fsigned-char or
>>>> -funsigned-char (or their bitfield equivalents, for that
>>>> matter).
>>>
>>> Well, I've considered this and I still don't think changing to
>>> -funsigned-char is good idea.
>>>
>>> Are you OK with me checking in the current patch or should I ask the
>>> other committers for a second opinion ?
>>
>> For the changes to tools/ it's really up to you. For the change
>> to xen/tools/symbols.c I could live with it (for being user
>> space code), but I still think adding casts in such a place is
>> not necessarily setting a good precedent. So for this one I'd
>> indeed appreciate getting another opinion.
> 
> My thoughts:
> 
> * On the whole, the risk of an incompatibility with system headers does seem higher than the risk of casting a value which is known not to be EOF
> 
> * Such a change doesn’t seem like the kind of thing we should ask Manuel to do, when a simpler change will do the trick
> 
> * At any rate it doesn’t seem like a good thing to experiment with in the week before the feature freeze.

Well, okay then, I withdraw my implied nak under these conditions.

Jan
Ian Jackson Jan. 27, 2021, 5 p.m. UTC | #17
George Dunlap writes ("Re: [PATCH] Fix error: array subscript has type 'char' [and 1 more messages]"):
> > On Jan 27, 2021, at 4:32 PM, Jan Beulich <jbeulich@suse.com> wrote:
> > On 27.01.2021 17:21, Ian Jackson wrote:
> >> Are you OK with me checking in the current patch or should I ask the
> >> other committers for a second opinion ?
> > 
> > For the changes to tools/ it's really up to you. For the change
> > to xen/tools/symbols.c I could live with it (for being user
> > space code), but I still think adding casts in such a place is
> > not necessarily setting a good precedent. So for this one I'd
> > indeed appreciate getting another opinion.
> 
> My thoughts:
> 
> * On the whole, the risk of an incompatibility with system headers does seem higher than the risk of casting a value which is known not to be EOF
> 
> * Such a change doesn’t seem like the kind of thing we should ask Manuel to do, when a simpler change will do the trick
> 
> * At any rate it doesn’t seem like a good thing to experiment with in the week before the feature freeze.

Thanks, George.  Given the release timing, I intend to commit Manuel's
patch unless I hear a further contrary opinion before 1700 UTC
tomorrow.

NB that with my RM hat on I consider this a bugfix so it would not
need a freeze exception to be committed next week but obviously with
my RM hat on I would prefer no to delay it.

Ian.
diff mbox series

Patch

diff --git a/tools/libs/light/libxl_qmp.c b/tools/libs/light/libxl_qmp.c
index c394000ea9..9b638e6f54 100644
--- a/tools/libs/light/libxl_qmp.c
+++ b/tools/libs/light/libxl_qmp.c
@@ -1249,7 +1249,7 @@  static int qmp_error_class_to_libxl_error_code(libxl__gc *gc,
                 se++;
                 continue;
             }
-            if (tolower(*s) != tolower(*se))
+            if (tolower((unsigned char)*s) != tolower((unsigned char)*se))
                 break;
             s++, se++;
         }
diff --git a/tools/xentrace/xentrace.c b/tools/xentrace/xentrace.c
index 4b50b8a53e..a8903ebf46 100644
--- a/tools/xentrace/xentrace.c
+++ b/tools/xentrace/xentrace.c
@@ -957,7 +957,7 @@  static int parse_cpumask_range(const char *mask_str, xc_cpumap_t map)
 {
     unsigned int a, b;
     int nmaskbits;
-    char c;
+    unsigned char c;
     int in_range;
     const char *s;
 
diff --git a/xen/tools/symbols.c b/xen/tools/symbols.c
index 9f9e2c9900..0b12452616 100644
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -173,11 +173,11 @@  static int read_symbol(FILE *in, struct sym_entry *s)
 	/* include the type field in the symbol name, so that it gets
 	 * compressed together */
 	s->len = strlen(str) + 1;
-	if (islower(stype) && filename)
+	if (islower((unsigned char)stype) && filename)
 		s->len += strlen(filename) + 1;
 	s->sym = malloc(s->len + 1);
 	sym = SYMBOL_NAME(s);
-	if (islower(stype) && filename) {
+	if (islower((unsigned char)stype) && filename) {
 		sym = stpcpy(sym, filename);
 		*sym++ = '#';
 	}