diff mbox series

[RFC] xen: Remove -Wdeclaration-after-statement

Message ID 20240809130418.10431-1-alejandro.vallejo@cloud.com (mailing list archive)
State New
Headers show
Series [RFC] xen: Remove -Wdeclaration-after-statement | expand

Commit Message

Alejandro Vallejo Aug. 9, 2024, 1:04 p.m. UTC
This warning only makes sense when developing using a compiler with C99
support on a codebase meant to be built with C89 compilers too, and
that's no longer the case (nor should it be, as it's been 25 years since
C99 came out already).

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
Yes, I'm opening this can of worms. I'd like to hear others people's
thoughts on this and whether this is something MISRA has views on. If
there's an ulterior non-obvious reason besides stylistic preference I
think it should be documented somewhere, but I haven't seen such an
explanation.

IMO, the presence of this warning causes several undesirable effects:

  1. Small functions are hampered by the preclusion of check+declare
     patterns that improve readability via concision. e.g: Consider a
     silly example like:

     /* with warning */                     /* without warning */
     void foo(uint8_t *p)                   void foo(uint8_t *p)
     {                                      {
         uint8_t  tmp1;                         if ( !p )
         uint16_t tmp2;                             return;
         uint32_t tmp3;
                                                uint8_t  tmp1 = OFFSET1 + *p;
         if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
             return;                            uint32_t tmp3 = OFFSET3 + *p;

         tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
         tmp2 = OFFSET2 + *p;               }
         tmp2 = OFFSET2 + *p;

         /* Lots of uses of tmpX */
     }

  2. Promotes scope-creep. On small functions it doesn't matter much,
     but on bigger ones to prevent declaring halfway through the body
     needlessly increases variable scope to the full scope in which they
     are defined rather than the subscope of point-of-declaration to
     end-of-current-scope. In cases in which they can be trivially
     defined at that point, it also means they can be trivially misused
     before they are meant to. i.e: On the example in (1) assume the
     conditional in "with warning" is actually a large switch statement.

  3. It facilitates a disconnect between time-of-declaration and
     time-of-use that lead very easily to "use-before-init" bugs.
     While a modern compiler can alleviate the most egregious cases of
     this, there's cases it simply cannot cover. A conditional
     initialization on anything with external linkage combined with a
     conditional use on something else with external linkage will squash
     the warning of using an uninitialised variable. Things are worse
     where the variable in question is preinitialised to something
     credible (e.g: a pointer to NULL), as then that can be misused
     between its declaration and its original point of intended use.

So... thoughts? yay or nay?
---
 xen/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich Aug. 9, 2024, 1:13 p.m. UTC | #1
On 09.08.2024 15:04, Alejandro Vallejo wrote:
> This warning only makes sense when developing using a compiler with C99
> support on a codebase meant to be built with C89 compilers too, and
> that's no longer the case (nor should it be, as it's been 25 years since
> C99 came out already).
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> Yes, I'm opening this can of worms. I'd like to hear others people's
> thoughts on this and whether this is something MISRA has views on. If
> there's an ulterior non-obvious reason besides stylistic preference I
> think it should be documented somewhere, but I haven't seen such an
> explanation.
> 
> IMO, the presence of this warning causes several undesirable effects:
> 
>   1. Small functions are hampered by the preclusion of check+declare
>      patterns that improve readability via concision. e.g: Consider a
>      silly example like:
> 
>      /* with warning */                     /* without warning */
>      void foo(uint8_t *p)                   void foo(uint8_t *p)
>      {                                      {
>          uint8_t  tmp1;                         if ( !p )
>          uint16_t tmp2;                             return;
>          uint32_t tmp3;
>                                                 uint8_t  tmp1 = OFFSET1 + *p;
>          if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
>              return;                            uint32_t tmp3 = OFFSET3 + *p;
> 
>          tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
>          tmp2 = OFFSET2 + *p;               }
>          tmp2 = OFFSET2 + *p;
> 
>          /* Lots of uses of tmpX */
>      }
> 
>   2. Promotes scope-creep. On small functions it doesn't matter much,
>      but on bigger ones to prevent declaring halfway through the body
>      needlessly increases variable scope to the full scope in which they
>      are defined rather than the subscope of point-of-declaration to
>      end-of-current-scope. In cases in which they can be trivially
>      defined at that point, it also means they can be trivially misused
>      before they are meant to. i.e: On the example in (1) assume the
>      conditional in "with warning" is actually a large switch statement.
> 
>   3. It facilitates a disconnect between time-of-declaration and
>      time-of-use that lead very easily to "use-before-init" bugs.
>      While a modern compiler can alleviate the most egregious cases of
>      this, there's cases it simply cannot cover. A conditional
>      initialization on anything with external linkage combined with a
>      conditional use on something else with external linkage will squash
>      the warning of using an uninitialised variable. Things are worse
>      where the variable in question is preinitialised to something
>      credible (e.g: a pointer to NULL), as then that can be misused
>      between its declaration and its original point of intended use.

Right, these are the aspects that would improve. The potential downside is
that you no longer have a fixed place (set of places) where to look for
which variables are actually in scope. For people having worked with C89
(and not e.g. C++) for a very long time, mixing of declarations and
statements may be irritating. In fact, having used C++ quite a lot in the
(meanwhile distant) past, I have developed a mental C mode and a mental
C++ one - when in the former I expect declarations at the start of scopes,
while when in the latter I know to expect them everywhere.

All in all - I'm afraid I'm split on this.

Jan
Stefano Stabellini Aug. 9, 2024, 7:25 p.m. UTC | #2
Adding Roberto

Does MISRA have a view on this? I seem to remember this is discouraged?


On Fri, 9 Aug 2024, Alejandro Vallejo wrote:
> This warning only makes sense when developing using a compiler with C99
> support on a codebase meant to be built with C89 compilers too, and
> that's no longer the case (nor should it be, as it's been 25 years since
> C99 came out already).
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> Yes, I'm opening this can of worms. I'd like to hear others people's
> thoughts on this and whether this is something MISRA has views on. If
> there's an ulterior non-obvious reason besides stylistic preference I
> think it should be documented somewhere, but I haven't seen such an
> explanation.
> 
> IMO, the presence of this warning causes several undesirable effects:
> 
>   1. Small functions are hampered by the preclusion of check+declare
>      patterns that improve readability via concision. e.g: Consider a
>      silly example like:
> 
>      /* with warning */                     /* without warning */
>      void foo(uint8_t *p)                   void foo(uint8_t *p)
>      {                                      {
>          uint8_t  tmp1;                         if ( !p )
>          uint16_t tmp2;                             return;
>          uint32_t tmp3;
>                                                 uint8_t  tmp1 = OFFSET1 + *p;
>          if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
>              return;                            uint32_t tmp3 = OFFSET3 + *p;
> 
>          tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
>          tmp2 = OFFSET2 + *p;               }
>          tmp2 = OFFSET2 + *p;
> 
>          /* Lots of uses of tmpX */
>      }
> 
>   2. Promotes scope-creep. On small functions it doesn't matter much,
>      but on bigger ones to prevent declaring halfway through the body
>      needlessly increases variable scope to the full scope in which they
>      are defined rather than the subscope of point-of-declaration to
>      end-of-current-scope. In cases in which they can be trivially
>      defined at that point, it also means they can be trivially misused
>      before they are meant to. i.e: On the example in (1) assume the
>      conditional in "with warning" is actually a large switch statement.
> 
>   3. It facilitates a disconnect between time-of-declaration and
>      time-of-use that lead very easily to "use-before-init" bugs.
>      While a modern compiler can alleviate the most egregious cases of
>      this, there's cases it simply cannot cover. A conditional
>      initialization on anything with external linkage combined with a
>      conditional use on something else with external linkage will squash
>      the warning of using an uninitialised variable. Things are worse
>      where the variable in question is preinitialised to something
>      credible (e.g: a pointer to NULL), as then that can be misused
>      between its declaration and its original point of intended use.
> 
> So... thoughts? yay or nay?

In my opinion, there are some instances where mixing declarations and
statements would enhance the code, but these are uncommon. Given the
choice between:

1) declarations always first
2) declarations always mixed with statements

I would choose 1).

One could say that mixing declarations with statements should be done
only when appropriate, but the challenge lies in the subjective nature
of "when appropriate." Different individuals have varying
interpretations of this, which could lead to unnecessary bikeshedding.

For these reasons, I do not support mixing declarations and statements.
Alejandro Vallejo Aug. 12, 2024, 1:20 p.m. UTC | #3
On Fri Aug 9, 2024 at 8:25 PM BST, Stefano Stabellini wrote:
> Adding Roberto
>
> Does MISRA have a view on this? I seem to remember this is discouraged?
>

I'd be surprised if MISRA didn't promote declaring close to first use to avoid
use-before-init, but you very clearly have a lot more exposure to it than I do.

I'm quite curious about what its preference is and the rationale for it.

>
> On Fri, 9 Aug 2024, Alejandro Vallejo wrote:
> > This warning only makes sense when developing using a compiler with C99
> > support on a codebase meant to be built with C89 compilers too, and
> > that's no longer the case (nor should it be, as it's been 25 years since
> > C99 came out already).
> > 
> > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> > ---
> > Yes, I'm opening this can of worms. I'd like to hear others people's
> > thoughts on this and whether this is something MISRA has views on. If
> > there's an ulterior non-obvious reason besides stylistic preference I
> > think it should be documented somewhere, but I haven't seen such an
> > explanation.
> > 
> > IMO, the presence of this warning causes several undesirable effects:
> > 
> >   1. Small functions are hampered by the preclusion of check+declare
> >      patterns that improve readability via concision. e.g: Consider a
> >      silly example like:
> > 
> >      /* with warning */                     /* without warning */
> >      void foo(uint8_t *p)                   void foo(uint8_t *p)
> >      {                                      {
> >          uint8_t  tmp1;                         if ( !p )
> >          uint16_t tmp2;                             return;
> >          uint32_t tmp3;
> >                                                 uint8_t  tmp1 = OFFSET1 + *p;
> >          if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
> >              return;                            uint32_t tmp3 = OFFSET3 + *p;
> > 
> >          tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
> >          tmp2 = OFFSET2 + *p;               }
> >          tmp2 = OFFSET2 + *p;
> > 
> >          /* Lots of uses of tmpX */
> >      }
> > 
> >   2. Promotes scope-creep. On small functions it doesn't matter much,
> >      but on bigger ones to prevent declaring halfway through the body
> >      needlessly increases variable scope to the full scope in which they
> >      are defined rather than the subscope of point-of-declaration to
> >      end-of-current-scope. In cases in which they can be trivially
> >      defined at that point, it also means they can be trivially misused
> >      before they are meant to. i.e: On the example in (1) assume the
> >      conditional in "with warning" is actually a large switch statement.
> > 
> >   3. It facilitates a disconnect between time-of-declaration and
> >      time-of-use that lead very easily to "use-before-init" bugs.
> >      While a modern compiler can alleviate the most egregious cases of
> >      this, there's cases it simply cannot cover. A conditional
> >      initialization on anything with external linkage combined with a
> >      conditional use on something else with external linkage will squash
> >      the warning of using an uninitialised variable. Things are worse
> >      where the variable in question is preinitialised to something
> >      credible (e.g: a pointer to NULL), as then that can be misused
> >      between its declaration and its original point of intended use.
> > 
> > So... thoughts? yay or nay?
>
> In my opinion, there are some instances where mixing declarations and
> statements would enhance the code, but these are uncommon. Given the
> choice between:
>
> 1) declarations always first
> 2) declarations always mixed with statements
>
> I would choose 1).

FWIW, so would I under those contraints. But let me at least try to persuade
you. There's at least two more arguments to weigh:

  1. It wasn't that long ago that we had to resort to GNU extensions to work
     around this restriction. It's mildly annoying having to play games with
     compiler extensions because we're purposely restricting our use of the
     language.

     See the clang codegen workaround:
        https://lore.kernel.org/xen-devel/D2ZM0D609TOQ.2GQQWR1QALUPA@cloud.com/

  2. There's an existing divide between toolstack and hypervisor. Toolstack
     already allows this kind of mixing, and it's hard not-to due to external
     dependencies. While style doesn't have to match 1:1, it'd be (imo)
     preferrable to try and remove avoidable differences.

     See (40be6307ec00: "Only compile the hypervisor with -Wdeclaration-after-statement")
         https://lore.kernel.org/xen-devel/20231205183226.26636-1-julien@xen.org/

With this in mind, ...

>
> One could say that mixing declarations with statements should be done
> only when appropriate, but the challenge lies in the subjective nature
> of "when appropriate." Different individuals have varying
> interpretations of this, which could lead to unnecessary bikeshedding.

... I do agree that whatever changes we introduce, considering the usual
complaints about the review process, should be in a direction of measurable
objectivity so as to minimize unproductive nitpicking.

In that sense, a differently worded choice is:

  1. Declarations always at the beginning of the scope closest to first use.
  2. Declarations always closest to first use.

They really aren't that different, and we do already make reviews asking for
declarations to be moved to the closest scope. Given this choice I definitely
prefer (2), in part because it removes the uncertainty from review that true
freeform declarations would allow, better aligns our dialect of C with the
current standard and aligns the hypervisor style (slightly) with the tools it
requires.

Plus the advantages I already mentioned in the original email.

>
> For these reasons, I do not support mixing declarations and statements.

Pending whatever MISRA has to say on the matter, I hope these arguments can
steer your view.

Cheers,
Alejandrq
Roberto Bagnara Aug. 12, 2024, 3:27 p.m. UTC | #4
On 09/08/24 21:25, Stefano Stabellini wrote:
> Adding Roberto
> 
> Does MISRA have a view on this? I seem to remember this is discouraged?

As far as I know, there is nothing in MISRA C against or in favor of
mixing declaration with statements.  The only (slightly) relevant
guideline is advisory Rule 8.9 (An object should be declared at block
scope if its identifier only appears in a single function), which advises
to declare objects at function scope when this is possible.
The rationale of the same rule says, among other things:

   Within a function, whether objects are declared at the outermost
   or innermost block is largely a matter of style.

On the other hand, MISRA C recognizes that reduction of scope
is, all other things being equal, to be preferred, but there
are no guidelines similar to  -Wdeclaration-after-statement.
Rather, the point is "declare it at block scope, if you can;
otherwise declare it at file scope, if you can; otherwise,
declare it at global scope, if you must."
Kind regards,

    Roberto

> On Fri, 9 Aug 2024, Alejandro Vallejo wrote:
>> This warning only makes sense when developing using a compiler with C99
>> support on a codebase meant to be built with C89 compilers too, and
>> that's no longer the case (nor should it be, as it's been 25 years since
>> C99 came out already).
>>
>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>> ---
>> Yes, I'm opening this can of worms. I'd like to hear others people's
>> thoughts on this and whether this is something MISRA has views on. If
>> there's an ulterior non-obvious reason besides stylistic preference I
>> think it should be documented somewhere, but I haven't seen such an
>> explanation.
>>
>> IMO, the presence of this warning causes several undesirable effects:
>>
>>    1. Small functions are hampered by the preclusion of check+declare
>>       patterns that improve readability via concision. e.g: Consider a
>>       silly example like:
>>
>>       /* with warning */                     /* without warning */
>>       void foo(uint8_t *p)                   void foo(uint8_t *p)
>>       {                                      {
>>           uint8_t  tmp1;                         if ( !p )
>>           uint16_t tmp2;                             return;
>>           uint32_t tmp3;
>>                                                  uint8_t  tmp1 = OFFSET1 + *p;
>>           if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
>>               return;                            uint32_t tmp3 = OFFSET3 + *p;
>>
>>           tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
>>           tmp2 = OFFSET2 + *p;               }
>>           tmp2 = OFFSET2 + *p;
>>
>>           /* Lots of uses of tmpX */
>>       }
>>
>>    2. Promotes scope-creep. On small functions it doesn't matter much,
>>       but on bigger ones to prevent declaring halfway through the body
>>       needlessly increases variable scope to the full scope in which they
>>       are defined rather than the subscope of point-of-declaration to
>>       end-of-current-scope. In cases in which they can be trivially
>>       defined at that point, it also means they can be trivially misused
>>       before they are meant to. i.e: On the example in (1) assume the
>>       conditional in "with warning" is actually a large switch statement.
>>
>>    3. It facilitates a disconnect between time-of-declaration and
>>       time-of-use that lead very easily to "use-before-init" bugs.
>>       While a modern compiler can alleviate the most egregious cases of
>>       this, there's cases it simply cannot cover. A conditional
>>       initialization on anything with external linkage combined with a
>>       conditional use on something else with external linkage will squash
>>       the warning of using an uninitialised variable. Things are worse
>>       where the variable in question is preinitialised to something
>>       credible (e.g: a pointer to NULL), as then that can be misused
>>       between its declaration and its original point of intended use.
>>
>> So... thoughts? yay or nay?
> 
> In my opinion, there are some instances where mixing declarations and
> statements would enhance the code, but these are uncommon. Given the
> choice between:
> 
> 1) declarations always first
> 2) declarations always mixed with statements
> 
> I would choose 1).
> 
> One could say that mixing declarations with statements should be done
> only when appropriate, but the challenge lies in the subjective nature
> of "when appropriate." Different individuals have varying
> interpretations of this, which could lead to unnecessary bikeshedding.
> 
> For these reasons, I do not support mixing declarations and statements.
>
Frediano Ziglio Aug. 13, 2024, 9:53 a.m. UTC | #5
On Fri, Aug 9, 2024 at 2:04 PM Alejandro Vallejo
<alejandro.vallejo@cloud.com> wrote:
>
> This warning only makes sense when developing using a compiler with C99
> support on a codebase meant to be built with C89 compilers too, and
> that's no longer the case (nor should it be, as it's been 25 years since
> C99 came out already).
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> Yes, I'm opening this can of worms. I'd like to hear others people's
> thoughts on this and whether this is something MISRA has views on. If
> there's an ulterior non-obvious reason besides stylistic preference I
> think it should be documented somewhere, but I haven't seen such an
> explanation.
>
> IMO, the presence of this warning causes several undesirable effects:
>
>   1. Small functions are hampered by the preclusion of check+declare
>      patterns that improve readability via concision. e.g: Consider a
>      silly example like:
>
>      /* with warning */                     /* without warning */
>      void foo(uint8_t *p)                   void foo(uint8_t *p)
>      {                                      {
>          uint8_t  tmp1;                         if ( !p )
>          uint16_t tmp2;                             return;
>          uint32_t tmp3;
>                                                 uint8_t  tmp1 = OFFSET1 + *p;
>          if ( !p )                              uint16_t tmp2 = OFFSET2 + *p;
>              return;                            uint32_t tmp3 = OFFSET3 + *p;
>
>          tmp1 = OFFSET1 + *p;                   /* Lots of uses of `tmpX` */
>          tmp2 = OFFSET2 + *p;               }
>          tmp2 = OFFSET2 + *p;
>
>          /* Lots of uses of tmpX */
>      }
>
>   2. Promotes scope-creep. On small functions it doesn't matter much,
>      but on bigger ones to prevent declaring halfway through the body
>      needlessly increases variable scope to the full scope in which they
>      are defined rather than the subscope of point-of-declaration to
>      end-of-current-scope. In cases in which they can be trivially
>      defined at that point, it also means they can be trivially misused
>      before they are meant to. i.e: On the example in (1) assume the
>      conditional in "with warning" is actually a large switch statement.
>
>   3. It facilitates a disconnect between time-of-declaration and
>      time-of-use that lead very easily to "use-before-init" bugs.
>      While a modern compiler can alleviate the most egregious cases of
>      this, there's cases it simply cannot cover. A conditional
>      initialization on anything with external linkage combined with a
>      conditional use on something else with external linkage will squash
>      the warning of using an uninitialised variable. Things are worse
>      where the variable in question is preinitialised to something
>      credible (e.g: a pointer to NULL), as then that can be misused
>      between its declaration and its original point of intended use.
>
> So... thoughts? yay or nay?

Hi,
   I personally agree with this change. Even compiler is stating that
this is just C89 compatibility.
Yes, it's question of style and surely having all declaration at the
beginning reduce the options however there are technical reason why
having mixed declarations can help
- you can use "const" to tell compiler (and code editor) that
something should not be changed;
- in many cases reduces commit changes;

> ---
>  xen/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/Makefile b/xen/Makefile
> index 2e1a925c8417..288b7ac8bb2d 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -394,7 +394,7 @@ CFLAGS-$(CONFIG_CC_SPLIT_SECTIONS) += -ffunction-sections -fdata-sections
>
>  CFLAGS += -nostdinc -fno-builtin -fno-common
>  CFLAGS += -Werror -Wredundant-decls -Wwrite-strings -Wno-pointer-arith
> -CFLAGS += -Wdeclaration-after-statement -Wuninitialized
> +CFLAGS += -Wuninitialized
>  $(call cc-option-add,CFLAGS,CC,-Wvla)
>  $(call cc-option-add,CFLAGS,CC,-Wflex-array-member-not-at-end)
>  $(call cc-option-add,CFLAGS,CC,-Winit-self)

Frediano
diff mbox series

Patch

diff --git a/xen/Makefile b/xen/Makefile
index 2e1a925c8417..288b7ac8bb2d 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -394,7 +394,7 @@  CFLAGS-$(CONFIG_CC_SPLIT_SECTIONS) += -ffunction-sections -fdata-sections
 
 CFLAGS += -nostdinc -fno-builtin -fno-common
 CFLAGS += -Werror -Wredundant-decls -Wwrite-strings -Wno-pointer-arith
-CFLAGS += -Wdeclaration-after-statement -Wuninitialized
+CFLAGS += -Wuninitialized
 $(call cc-option-add,CFLAGS,CC,-Wvla)
 $(call cc-option-add,CFLAGS,CC,-Wflex-array-member-not-at-end)
 $(call cc-option-add,CFLAGS,CC,-Winit-self)