Message ID | 20220607125441.36757-1-lucvoo@kernel.org (mailing list archive) |
---|---|
State | Mainlined, archived |
Series | allow show_token() on TOKEN_ZERO_IDENT |

On Tue, Jun 7, 2022 at 5:55 AM Luc Van Oostenryck <lucvoo@kernel.org> wrote:
>
> TOKEN_ZERO_IDENTs are created during the evaluation of pre-processor
> expressions but which otherwise are normal idents and were first tokenized
> as TOKEN_IDENTs.
>
> As such, they could perfectly be displayed by show_token() but are not.
> So, in error messages they are displayed as "unhandled token type '4'",
> which is not at all informative.
>
> Fix this by letting show_token() process them like usual TOKEN_IDENTs.
> Idem for quote_token().

Ack.

I do wonder if it should be marked somehow as being that special case.
The main reason for 'show_token()' is debugging, after all, and
TOKEN_ZERO_IDENT does have magical properties in how it either
silently expands to the constant '0', or it generates a warning about
undefined preprocessor symbol.

But considering that we've apparently reported it as "unhandled token
type '4'" since 2005, I guess it's not exactly a big deal.

Linus

On Tue, Jun 07, 2022 at 11:26:36AM -0700, Linus Torvalds wrote:
> On Tue, Jun 7, 2022 at 5:55 AM Luc Van Oostenryck <lucvoo@kernel.org> wrote:
> >
> > TOKEN_ZERO_IDENTs are created during the evaluation of pre-processor
> > expressions but which otherwise are normal idents and were first tokenized
> > as TOKEN_IDENTs.
> >
> > As such, they could perfectly be displayed by show_token() but are not.
> > So, in error messages they are displayed as "unhandled token type '4'",
> > which is not at all informative.
> >
> > Fix this by letting show_token() process them like usual TOKEN_IDENTs.
> > Idem for quote_token().
>
> Ack.
>
> I do wonder if it should be marked somehow as being that special case.
> The main reason for 'show_token()' is debugging, after all, and
> TOKEN_ZERO_IDENT does have magical properties in how it either
> silently expands to the constant '0', or it generates a warning about
> undefined preprocessor symbol.
>
> But considering that we've apparently reported it as "unhandled token
> type '4'" since 2005, I guess it's not exactly a big deal.

Yes, I first thought to do so but then chose not because I could not
convince myself that its special property was irrelevant in
warning/error messages. It looks to me more as an internal thing,
more semantical than lexical, and a non-faithful representation
would be confusing in messages.

For context, the input text I had (from GCC's testsuite) was:
	#define empty
	#if empty#cpu(m68k)
	#endif
and the error message sparse issued was:
	error: garbage at end: #unhandled token type '4' (unhandled token type '4' )
with this patch it's:
	error: garbage at end: #cpu(m68k)

-- Luc

diff --git a/tokenize.c b/tokenize.c
index ea7105438270..fdaea370cc48 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -201,6 +201,7 @@ const char *show_token(const struct token *token)
 		return "end-of-input";
 
 	case TOKEN_IDENT:
+	case TOKEN_ZERO_IDENT:
 		return show_ident(token->ident);
 
 	case TOKEN_NUMBER:
@@ -259,6 +260,7 @@ const char *quote_token(const struct token *token)
 		return "syntax error";
 
 	case TOKEN_IDENT:
+	case TOKEN_ZERO_IDENT:
 		return show_ident(token->ident);
 
 	case TOKEN_NUMBER: