diff mbox series

[RFC] CodingGuidelines: mark external declarations with "extern"

Message ID xmqqtuv3ryhr.fsf_-_@gitster.c.googlers.com (mailing list archive)
State New, archived
Headers show
Series [RFC] CodingGuidelines: mark external declarations with "extern" | expand

Commit Message

Junio C Hamano Oct. 9, 2020, 7:27 p.m. UTC
The document recommends not to write "extern" in front of a function
declaration, with justification that a decl by default is extern.

While it is true that by default decls are extern unless marked with
"static", it does not justify the choice.  It only justifies why we
could omit it if we wanted to, but does not say anything about the
reason why we would want to omit it in the first place.

Recently we saw that past mechanical conversion attempts kept a few
function decls with "extern" in front.  It seems that it was left
because the mechanical conversion tried not to touch external
declaration of pointers to functions, but the pattern used was
faulty and excluded functions that return pointers to functions.

For example, a pointer to a function may look like this:

    extern void (*default_frotz_handler)(int frotz);

We must not omit "extern" from this decl, which says "There exists a
variable whose name is default_frotz_handler, which points at a
function that takes an integer and returns nothing."  It is not a
function declaration and if written without "extern" in front,
requires the "common" extension to be correctly built.

But a function that returns a pointer to a function looks similar:

    extern void (*get_error_routine(void))(const char *message, ...);

which says "There is a get_error_routine() function that takes no
parameters, and it returns a pointer to a function that takes these
parameters and returns nothing".

The current rule tells us not to write "extern" in front, but the
earlier mechanical conversion missed it.  People when writing would
also miss it unless they are careful.

Instead of forcing contributors to spend time on on thinking if they
should or should not write "extern" in front of each of their decl
when they write new code, tell them that our external declarations
always say "extern" in front.  By doing so, we would also prevent a
mistake of not writing "extern" when we need to (i.e. decls of data
items, that are not functions) when less experienced developers try
to mimic how the existing surrounding declarations are written.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/CodingGuidelines | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Jeff King Oct. 9, 2020, 7:57 p.m. UTC | #1
On Fri, Oct 09, 2020 at 12:27:28PM -0700, Junio C Hamano wrote:

> The document recommends not to write "extern" in front of a function
> declaration, with justification that a decl by default is extern.
> 
> While it is true that by default decls are extern unless marked with
> "static", it does not justify the choice.  It only justifies why we
> could omit it if we wanted to, but does not say anything about the
> reason why we would want to omit it in the first place.

True, but I think that is perhaps a failing of the document, not of the
original motivation. The reason to have _a_ standard is that it is
confusing when some functions say "extern" and some do not (and it was a
continued source of review friction when people pointed out
inconsistencies). But the reason to standardize on omitting it is that
it clutters the line, making it longer and harder to read.

The argument for including it is less clear to me. You say below:

> [...]By doing so, we would also prevent a
> mistake of not writing "extern" when we need to (i.e. decls of data
> items, that are not functions) when less experienced developers try
> to mimic how the existing surrounding declarations are written.

but to my recollection that has not been a big problem. And it's one
that's usually easily caught by the compiler. A missing "extern" on a
variable will usually get you a multiple-definition warning at
link-time (if you manage to also omit the actual definition you won't
see that, though "make sparse" will warn that your variable ought to be
static).

So I have a mild preference to omit it, if we were starting from
scratch. But starting form where we are now, without "extern", my
preference is even stronger, because I don't want to waste submitter or
reviewer time switching all of the existing cases.

> But a function that returns a pointer to a function looks similar:
> 
>     extern void (*get_error_routine(void))(const char *message, ...);

IMHO the real problem here is that C's syntax for returning a function
pointer is so horrendous. How about this (on top of your earlier patch
to drop the extern from that declaration)?

-- >8 --
Subject: [PATCH] usage: define a type for a reporting function

The usage, die, warning, and error routines all work with a function
pointer that takes the message to be reported. We usually just mention
the function's full type inline. But this makes the use of these
pointers hard to read, especially because C's syntax for returning a
function pointer is so awful:

  void (*get_error_routine(void))(const char *err, va_list params);

Unless you read it very carefully, this looks like a function pointer
declaration. Let's instead use a single typedef to define a reporting
function, which is the same for all four types.

Signed-off-by: Jeff King <peff@peff.net>
---
I wondered if the compiler would complain about attaching a noreturn
attribute to an already-typedef'd pointer (rather than including it in
the typedef). gcc seems to be fine with it, but we could also define two
types, one with noreturn and one without.

 git-compat-util.h | 12 +++++++-----
 usage.c           | 18 +++++++++---------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 56e0e1e79d..adfea06897 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -489,11 +489,13 @@ static inline int const_error(void)
 #define error_errno(...) (error_errno(__VA_ARGS__), const_error())
 #endif
 
-void set_die_routine(NORETURN_PTR void (*routine)(const char *err, va_list params));
-void set_error_routine(void (*routine)(const char *err, va_list params));
-void (*get_error_routine(void))(const char *err, va_list params);
-void set_warn_routine(void (*routine)(const char *warn, va_list params));
-void (*get_warn_routine(void))(const char *warn, va_list params);
+typedef void (*report_fn)(const char *, va_list params);
+
+void set_die_routine(NORETURN_PTR report_fn routine);
+void set_error_routine(report_fn routine);
+report_fn get_error_routine(void);
+void set_warn_routine(report_fn routine);
+report_fn get_warn_routine(void);
 void set_die_is_recursing_routine(int (*routine)(void));
 
 int starts_with(const char *str, const char *prefix);
diff --git a/usage.c b/usage.c
index 58fb5fff5f..06665823a2 100644
--- a/usage.c
+++ b/usage.c
@@ -108,33 +108,33 @@ static int die_is_recursing_builtin(void)
 
 /* If we are in a dlopen()ed .so write to a global variable would segfault
  * (ugh), so keep things static. */
-static NORETURN_PTR void (*usage_routine)(const char *err, va_list params) = usage_builtin;
-static NORETURN_PTR void (*die_routine)(const char *err, va_list params) = die_builtin;
-static void (*error_routine)(const char *err, va_list params) = error_builtin;
-static void (*warn_routine)(const char *err, va_list params) = warn_builtin;
+static NORETURN_PTR report_fn usage_routine = usage_builtin;
+static NORETURN_PTR report_fn die_routine = die_builtin;
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
 static int (*die_is_recursing)(void) = die_is_recursing_builtin;
 
-void set_die_routine(NORETURN_PTR void (*routine)(const char *err, va_list params))
+void set_die_routine(NORETURN_PTR report_fn routine)
 {
 	die_routine = routine;
 }
 
-void set_error_routine(void (*routine)(const char *err, va_list params))
+void set_error_routine(report_fn routine)
 {
 	error_routine = routine;
 }
 
-void (*get_error_routine(void))(const char *err, va_list params)
+report_fn get_error_routine(void)
 {
 	return error_routine;
 }
 
-void set_warn_routine(void (*routine)(const char *warn, va_list params))
+void set_warn_routine(report_fn routine)
 {
 	warn_routine = routine;
 }
 
-void (*get_warn_routine(void))(const char *warn, va_list params)
+report_fn get_warn_routine(void)
 {
 	return warn_routine;
 }
Junio C Hamano Oct. 9, 2020, 8:33 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> The argument for including it is less clear to me. You say below:
>
>> [...]By doing so, we would also prevent a
>> mistake of not writing "extern" when we need to (i.e. decls of data
>> items, that are not functions) when less experienced developers try
>> to mimic how the existing surrounding declarations are written.
>
> but to my recollection that has not been a big problem. And it's one
> that's usually easily caught by the compiler. A missing "extern" on a
> variable will usually get you a multiple-definition warning at
> link-time (if you manage to also omit the actual definition you won't
> see that, though "make sparse" will warn that your variable ought to be
> static).

Not really, that is where the "common" extension comes in, to help
us with it hurt others without it, unknowingly X-<.

        $ cat >a.c <<\EOF
        #include <stdio.h>
        #include "c.h"

        int common = 47;

        int main(int ac, char **av)
        {
            printf("%d\n", common + other);
            return 0;
        }
        EOF
        $ cat >b.c <<\EOF
        #include "c.h"

        int other = 22;
        EOF
        $ cat >c.h <<\EOF
        int common;
        int other;
        EOF
        $ gcc -Wall -o c a.c b.c; ./c
        59

And I have a strong preference, after thinking about it, to have
"extern" in front in the declarations.  It gives another clue for
patterns I feed to "git grep" to latch onto, and help my eyes to
scan and tell decls and defns apart in the output.  The benefit
alone is worth the extra 7 columns in front spent, which you call
"clutter".

> IMHO the real problem here is that C's syntax for returning a function
> pointer is so horrendous. How about this (on top of your earlier patch
> to drop the extern from that declaration)?

In general, I like a typedef for callback function that shortens the
decl of a function that takes such a callback, so I think

> +void set_error_routine(report_fn routine);
> +void set_warn_routine(report_fn routine);
> +report_fn get_error_routine(void);
> +report_fn get_warn_routine(void);

these are good, but they are better with "extern" in front in a
header file to make it clear they are declarations and not
definitions when they appear in "git grep" output.
Denton Liu Oct. 9, 2020, 11 p.m. UTC | #3
Hi Junio,

On Fri, Oct 09, 2020 at 01:33:39PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > The argument for including it is less clear to me. You say below:
> >
> >> [...]By doing so, we would also prevent a
> >> mistake of not writing "extern" when we need to (i.e. decls of data
> >> items, that are not functions) when less experienced developers try
> >> to mimic how the existing surrounding declarations are written.
> >
> > but to my recollection that has not been a big problem. And it's one
> > that's usually easily caught by the compiler. A missing "extern" on a
> > variable will usually get you a multiple-definition warning at
> > link-time (if you manage to also omit the actual definition you won't
> > see that, though "make sparse" will warn that your variable ought to be
> > static).
> 
> Not really, that is where the "common" extension comes in, to help
> us with it hurt others without it, unknowingly X-<.

I'm not really sure what you mean by the "common" extension. 

>         $ cat >a.c <<\EOF
>         #include <stdio.h>
>         #include "c.h"
> 
>         int common = 47;
> 
>         int main(int ac, char **av)
>         {
>             printf("%d\n", common + other);
>             return 0;
>         }
>         EOF
>         $ cat >b.c <<\EOF
>         #include "c.h"
> 
>         int other = 22;
>         EOF
>         $ cat >c.h <<\EOF
>         int common;
>         int other;
>         EOF
>         $ gcc -Wall -o c a.c b.c; ./c
>         59

On gcc 10.2.0, it errors out successfully. Although on clang 10.0.1, it
compiles successfully and produces "69". That being said, I think extern
variables are relatively rare in our codebase and, when it happens, they
usually come as part of lists of other extern variables so a developer
who's mimicking the surrounding code would be able to copy it
successfully. Otherwise, the decl usually pops out in header files as it
is quite unusual.

> And I have a strong preference, after thinking about it, to have
> "extern" in front in the declarations.  It gives another clue for
> patterns I feed to "git grep" to latch onto, and help my eyes to
> scan and tell decls and defns apart in the output.  The benefit
> alone is worth the extra 7 columns in front spent, which you call
> "clutter".

To be honest, I do not have any preference between having the explicit
extern or not. I do have a strong preference, however, for having a
codebase that's consistently written. When I was doing the refactor, I
wouldn't have minded introducing extern everywhere although that wasn't
suggested as an alternative.

I agree that these are all benefits of declaring functions explicitly as
extern. However, I don't think they're worth the cost of either another
huge rewrite or an inconsistent codebase.

> > IMHO the real problem here is that C's syntax for returning a function
> > pointer is so horrendous. How about this (on top of your earlier patch
> > to drop the extern from that declaration)?
> 
> In general, I like a typedef for callback function that shortens the
> decl of a function that takes such a callback, so I think
> 
> > +void set_error_routine(report_fn routine);
> > +void set_warn_routine(report_fn routine);
> > +report_fn get_error_routine(void);
> > +report_fn get_warn_routine(void);
> 
> these are good, but they are better with "extern" in front in a
> header file to make it clear they are declarations and not
> definitions when they appear in "git grep" output.

I agree that this looks a lot better, with or without the extern.
Junio C Hamano Oct. 9, 2020, 11:05 p.m. UTC | #4
Denton Liu <liu.denton@gmail.com> writes:

>> And I have a strong preference, after thinking about it, to have
>> "extern" in front in the declarations.  It gives another clue for
>> patterns I feed to "git grep" to latch onto, and help my eyes to
>> scan and tell decls and defns apart in the output.  The benefit
>> alone is worth the extra 7 columns in front spent, which you call
>> "clutter".
>
> To be honest, I do not have any preference between having the explicit
> extern or not. I do have a strong preference, however, for having a
> codebase that's consistently written. When I was doing the refactor, I
> wouldn't have minded introducing extern everywhere although that wasn't
> suggested as an alternative.
>
> I agree that these are all benefits of declaring functions explicitly as
> extern. However, I don't think they're worth the cost of either another
> huge rewrite or an inconsistent codebase.

Yes, there is a cost associated with having made a mistake in the
past.  Biting the bullet now, perhaps as the first tree-wide change
immediately after the upcoming release, while the tree is quiescent,
would help us in the longer term, than having to live without extern
on declarations.
Đoàn Trần Công Danh Oct. 10, 2020, 12:37 a.m. UTC | #5
On 2020-10-09 13:33:39-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
> 
> > The argument for including it is less clear to me. You say below:
> >
> >> [...]By doing so, we would also prevent a
> >> mistake of not writing "extern" when we need to (i.e. decls of data
> >> items, that are not functions) when less experienced developers try
> >> to mimic how the existing surrounding declarations are written.
> >
> > but to my recollection that has not been a big problem. And it's one
> > that's usually easily caught by the compiler. A missing "extern" on a
> > variable will usually get you a multiple-definition warning at
> > link-time (if you manage to also omit the actual definition you won't
> > see that, though "make sparse" will warn that your variable ought to be
> > static).
> 
> Not really, that is where the "common" extension comes in, to help
> us with it hurt others without it, unknowingly X-<.

Yes, that's where tentative definition jumpes in.
But, tentative definition is known to cause headache to compiler optimization.
And from GCC 10, gcc change to `-fno-common` by default.

We can enable `-fno-common` now if we can detect our compiler is gcc,
but, I don't think it worth to fiddle with Makefile to add that logic.

> 
>         $ cat >a.c <<\EOF
>         #include <stdio.h>
>         #include "c.h"
> 
>         int common = 47;
> 
>         int main(int ac, char **av)
>         {
>             printf("%d\n", common + other);
>             return 0;
>         }
>         EOF
>         $ cat >b.c <<\EOF
>         #include "c.h"
> 
>         int other = 22;
>         EOF
>         $ cat >c.h <<\EOF
>         int common;
>         int other;
>         EOF
>         $ gcc -Wall -o c a.c b.c; ./c
>         59
> 
> And I have a strong preference, after thinking about it, to have
> "extern" in front in the declarations.  It gives another clue for
> patterns I feed to "git grep" to latch onto, and help my eyes to
> scan and tell decls and defns apart in the output.


With this argument, I think adding "extern" is worth it.
It could help people find the code better.
Jeff King Oct. 15, 2020, 1:36 a.m. UTC | #6
On Fri, Oct 09, 2020 at 01:33:39PM -0700, Junio C Hamano wrote:

> >> [...]By doing so, we would also prevent a
> >> mistake of not writing "extern" when we need to (i.e. decls of data
> >> items, that are not functions) when less experienced developers try
> >> to mimic how the existing surrounding declarations are written.
> >
> > but to my recollection that has not been a big problem. And it's one
> > that's usually easily caught by the compiler. A missing "extern" on a
> > variable will usually get you a multiple-definition warning at
> > link-time (if you manage to also omit the actual definition you won't
> > see that, though "make sparse" will warn that your variable ought to be
> > static).
> 
> Not really, that is where the "common" extension comes in, to help
> us with it hurt others without it, unknowingly X-<.

As others noted, gcc 10 actually does complain about this. And we can
easily stick -fno-common into the DEVELOPER knobs, if it's something we
want to catch (I had actually forgotten it wasn't the default).

That said....

> And I have a strong preference, after thinking about it, to have
> "extern" in front in the declarations.  It gives another clue for
> patterns I feed to "git grep" to latch onto, and help my eyes to
> scan and tell decls and defns apart in the output.  The benefit
> alone is worth the extra 7 columns in front spent, which you call
> "clutter".

I still don't like it, and I'm convinced spending any effort to switch
between the two styles is a waste of time. But it's absolutely not the
hill I want to die on, so if you feel strongly, go for it.

> > +void set_error_routine(report_fn routine);
> > +void set_warn_routine(report_fn routine);
> > +report_fn get_error_routine(void);
> > +report_fn get_warn_routine(void);
> 
> these are good, but they are better with "extern" in front in a
> header file to make it clear they are declarations and not
> definitions when they appear in "git grep" output.

I see you picked up my patch as jk/report-fn-typedef, but applied it
directly on v2.28, and not on top of your "drop these extra externs"
patch. That makes sense if we're not going to remove them, but then your
conflict resolution shows my patch as removing them. :)

If we're going to keep them, it should probably leave them in the
existing spots? Or I guess it is OK as-is if you're planning to add them
back in to all of the functions shortly after, not just the ones that
already had extern on them.

-Peff
Junio C Hamano Oct. 15, 2020, 5:15 p.m. UTC | #7
Jeff King <peff@peff.net> writes:

> On Fri, Oct 09, 2020 at 01:33:39PM -0700, Junio C Hamano wrote:
>
>> >> [...]By doing so, we would also prevent a
>> >> mistake of not writing "extern" when we need to (i.e. decls of data
>> >> items, that are not functions) when less experienced developers try
>> >> to mimic how the existing surrounding declarations are written.
>> >
>> > but to my recollection that has not been a big problem. And it's one
>> > that's usually easily caught by the compiler. A missing "extern" on a
>> > variable will usually get you a multiple-definition warning at
>> > link-time (if you manage to also omit the actual definition you won't
>> > see that, though "make sparse" will warn that your variable ought to be
>> > static).
>> 
>> Not really, that is where the "common" extension comes in, to help
>> us with it hurt others without it, unknowingly X-<.
>
> As others noted, gcc 10 actually does complain about this. And we can
> easily stick -fno-common into the DEVELOPER knobs, if it's something we
> want to catch (I had actually forgotten it wasn't the default).

Yup, that is a good thing to do regardless.  I am mostly interested
in seeing "extern" in front of all extern decls (not defns) from
human readers' point of view, though.

> I see you picked up my patch as jk/report-fn-typedef, but applied it
> directly on v2.28, and not on top of your "drop these extra externs"
> patch. That makes sense if we're not going to remove them, but then your
> conflict resolution shows my patch as removing them. :)

I think the patch rearranged that way shows why the new typedef
shines.

It makes it immediately recognisable that set_die_routine() and
friends are functions, not pointers to functions, and because the
reason why "extern" was in front of them was because the last person
who "cleaned up" the header failed to recoginise that they are
functions without the help from this new typedef.

All of that depends on the fact that everybody understands that the
current rule is not to write "extern" in front of external
declaration of functions, so perhaps we may want to update the log
message to mention why "extern" were removed from these two.

Thanks.
Jeff King Oct. 15, 2020, 7:28 p.m. UTC | #8
On Thu, Oct 15, 2020 at 10:15:42AM -0700, Junio C Hamano wrote:

> > As others noted, gcc 10 actually does complain about this. And we can
> > easily stick -fno-common into the DEVELOPER knobs, if it's something we
> > want to catch (I had actually forgotten it wasn't the default).
> 
> Yup, that is a good thing to do regardless.  I am mostly interested
> in seeing "extern" in front of all extern decls (not defns) from
> human readers' point of view, though.

OK, patch incoming.

> > I see you picked up my patch as jk/report-fn-typedef, but applied it
> > directly on v2.28, and not on top of your "drop these extra externs"
> > patch. That makes sense if we're not going to remove them, but then your
> > conflict resolution shows my patch as removing them. :)
> 
> I think the patch rearranged that way shows why the new typedef
> shines.
> 
> It makes it immediately recognisable that set_die_routine() and
> friends are functions, not pointers to functions, and because the
> reason why "extern" was in front of them was because the last person
> who "cleaned up" the header failed to recoginise that they are
> functions without the help from this new typedef.
> 
> All of that depends on the fact that everybody understands that the
> current rule is not to write "extern" in front of external
> declaration of functions, so perhaps we may want to update the log
> message to mention why "extern" were removed from these two.

I added:

  Note that this also removes the "extern" from these declarations to
  match the surrounding functions. They were missed in 554544276a
  (*.[ch]: remove extern from function declarations using spatch,
  2019-04-29) presumably because of the unusual syntax.

which I think clarifies it.

Here's that patch re-rolled, plus the DEVELOPER one.

  [1/2]: usage: define a type for a reporting function
  [2/2]: config.mak.dev: build with -fno-common

 config.mak.dev    |  1 +
 git-compat-util.h | 12 +++++++-----
 usage.c           | 18 +++++++++---------
 3 files changed, 17 insertions(+), 14 deletions(-)

-Peff
diff mbox series

Patch

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 45465bc0c9..eafe41bec8 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -433,10 +433,12 @@  For C programs:
  - Use Git's gettext wrappers to make the user interface
    translatable. See "Marking strings for translation" in po/README.
 
- - Variables and functions local to a given source file should be marked
-   with "static". Variables that are visible to other source files
-   must be declared with "extern" in header files. However, function
-   declarations should not use "extern", as that is already the default.
+ - Variables and functions local to a given source file should be
+   marked with "static". Variables that are visible to other source
+   files must be declared with "extern" in header files.  External
+   function declarations should also use "extern", while external
+   function definition should not, to make it easier to visually tell
+   them apart.
 
  - You can launch gdb around your program using the shorthand GIT_DEBUGGER.
    Run `GIT_DEBUGGER=1 ./bin-wrappers/git foo` to simply use gdb as is, or