diff mbox series

git-compat-util(msvc): C11 does not imply support for zero-sized arrays

Message ID pull.1094.git.1638823724410.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series git-compat-util(msvc): C11 does not imply support for zero-sized arrays | expand

Commit Message

Johannes Schindelin Dec. 6, 2021, 8:48 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

This seems to be required to define `FLEX_ARRAY` in a manner that MSVC
in C11 mode accepts. Without this fix, building Git for Windows'
experimental FSCache code would fail thusly:

	error C2229: struct 'heap_fsentry' has an illegal zero-sized array

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    git-compat-util(msvc): C11 does not imply support for zero-sized arrays
    
    In Git for Windows' continuously-rebased branches, I found this problem
    [https://github.com/git-for-windows/git/runs/4431149285?check_suite_focus=true#step:9:14507]
    (which uses FLEX_ARRAY correctly, but fails because FLEX_ARRAY is no
    longer defined as required by MS Visual C).
    
    This patch is based on bc/require-c99.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1094%2Fdscho%2Fflex-array-and-msvc-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1094/dscho/flex-array-and-msvc-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1094

 git-compat-util.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 7bc341e21b566c6685b7d993ca80459f9994be38

Comments

Junio C Hamano Dec. 6, 2021, 9:48 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 19943e214ba..c9f508b3a83 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -46,7 +46,7 @@
>  /*
>   * See if our compiler is known to support flexible array members.
>   */
> -#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
> +#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && !defined(_MSC_VER) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
>  # define FLEX_ARRAY /* empty */
>  #elif defined(__GNUC__)
>  # if (__GNUC__ >= 3)

Is flex array a requirement for STDC 199901L or newer?

I am assuming it is (e.g. https://en.wikipedia.org/wiki/C99), and I
can see that it is an unreliable source of information, given that
we see the second vendor whose conformance statement __STDC_VERSION__
makes contradicts with what the compiler does.

It might be that future MSC starts supporting flex array and this
line will become even harder to grok when it happens.  I'd rather
see us cleaning up the mess first.

Specifically, I am wondering if we should use the test based on
__STDC_VERSION__ as the last resort, giving the precedence to more
vendor specific tests, to avoid future problems.  That way, we can
cater to both camps of compiler vendors, the ones where their
STDC_VERSION may not say they have C99 but they do support flex
array the modern way, and the others where their STDC_VERSION say
they have C99 but the don't do flex array.

How about a preliminary clean-up patch that brings us to the
preimage of the following patchlet in step [1/2]?  Then we can do
the single-liner addition of _MSC_VER to support you, and the end
result would be a lot easier to read and maintain?

    /* Vendor specific exceptions first */
    #if   defined(__SUNPRO_C) && (__SUNPRO_C <= 0x580)
   +#elif defined(_MSC_VER)
    #elif defined(__GNUC__)
    # if (__GNUC__ >= 3)
    #  define FLEX_ARRAY /* empty */
    # else
    #  define FLEX_ARRAY 0 /* older GNU extension */
    # endif
    #elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
    # define FLEX_ARRAY /* empty */
    #endif

    /* Otherwise default to safer but wasteful */
    #ifndef FLEX_ARRAY
    # define FLEX_ARRAY 1
    #endif

Thanks.
Neeraj Singh Dec. 6, 2021, 10:25 p.m. UTC | #2
I'm a little confused by this issue. Looks like an empty flex array
is supported here: https://godbolt.org/z/j3ndYTEfx. 
(Note that I'm passing /TC to force C compilation).

Also, heap_fsentry doesn't appear to use a flex array in current sources.

If it does start using it, there issue may actually be elsewhere besides
the struct definition (it could be just a badly targeted compiler error).
We have code like `struct heap_fsentry key[2];`.  That declaration can't
work with a flex array.

Thanks,
Neeraj
(wearing the ex-MSVC dev hat)
Johannes Schindelin Dec. 7, 2021, 9:33 p.m. UTC | #3
Hi Neeraj,

On Mon, 6 Dec 2021, Neeraj Singh wrote:

> I'm a little confused by this issue. Looks like an empty flex array
> is supported here: https://godbolt.org/z/j3ndYTEfx.

It is somewhat strange, but understandable, that the empty flex array at
the end of a struct isn't triggering a compile error. But having a field
_after_ that empty flex array _does_ trigger a compile error:

struct MyStruct {
    int x;
    int flexA[];
    char string[256];
};

Note the added `string` field.

See https://godbolt.org/z/TbcYhEW5d, it says:

	<source>(5): error C2229: struct 'MyStruct' has an illegal zero-sized array
	Compiler returned: 2

> (Note that I'm passing /TC to force C compilation).

Yes, `/TC` is one of the flags we pass to MSVC. For more details, see
https://github.com/git-for-windows/git/runs/4389081542?check_suite_focus=true#step:9:125

> Also, heap_fsentry doesn't appear to use a flex array in current sources.

It does, but it is admittedly a bit convoluted and not very easy to see.
The definition of `heap_fsentry` is
[this](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L77-L80):

	struct heap_fsentry {
		struct fsentry ent;
		char dummy[MAX_LONG_PATH];
	};

No flex array here, right? But wait, there is a `struct fsentry`. Let's
look at
[that](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L43-L74):

	struct fsentry {
		struct hashmap_entry ent;
		[...]
		/*
		 * Name of the entry. For directory listings: relative path of the
		 * directory, without trailing '/' (empty for cwd()). For file
		 * entries:
		 * name of the file. Typically points to the end of the structure
		 * if
		 * the fsentry is allocated on the heap (see fsentry_alloc), or to
		 * a
		 * local variable if on the stack (see fsentry_init).
		 */
		struct dirent dirent;
	};

Still no flex array, right? But wait, there is a `struct dirent`. Let's
[see](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/dirent.h#L9-L12):

	struct dirent {
		unsigned char d_type; /* file type to prevent lstat after readdir */
		char d_name[FLEX_ARRAY]; /* file name */
	};

Finally! We see the flex array.

Now, you may ask why is this even correct? How can you have an empty-sized
field in a struct that is inside another struct that is inside yet another
struct _and then followed by another field_?

The reason why this is correct and intended is that `struct dirent`
intentionally leaves the length of the `d_name` undefined, to leave it to
the implementation whether a fixed-size buffer is used or a
specifically-allocated one of the exact correct size for a _specific_
directory entry.

In FSCache, we want to allocate a large-enough buffer to fit _any_ file
name, and it should not only contain the metadata in `struct dirent`, but
additionally some FSCache-specific metadata.

Therefore, `struct fsentry` is kind of a subclass of `struct dirent`, and
`struct heap_fsentry` is kind of a subclass of something that does not
exist, a `struct dirent` that offers enough space to fit _any_ legal
`d_name` (that is what that `dummy` field is for, it is not actually
intended to be accessed except via `d_name`).

> If it does start using it, there issue may actually be elsewhere besides
> the struct definition (it could be just a badly targeted compiler error).
> We have code like `struct heap_fsentry key[2];`.  That declaration can't
> work with a flex array.

I hope my explanation above made sense to you.

Admittedly, it is slightly icky code, but honestly, I do not have any
splendid idea how to make it less complicated to understand. Do you?

Ciao,
Dscho
Neeraj Singh Dec. 7, 2021, 10:22 p.m. UTC | #4
On Tue, Dec 7, 2021 at 1:33 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Neeraj,
>
> On Mon, 6 Dec 2021, Neeraj Singh wrote:
>
> > I'm a little confused by this issue. Looks like an empty flex array
> > is supported here: https://godbolt.org/z/j3ndYTEfx.
>
> It is somewhat strange, but understandable, that the empty flex array at
> the end of a struct isn't triggering a compile error. But having a field
> _after_ that empty flex array _does_ trigger a compile error:
>
> struct MyStruct {
>     int x;
>     int flexA[];
>     char string[256];
> };
>
> Note the added `string` field.
>

Please see https://godbolt.org/z/4Tb9PobYM.  GCC throws a morally
equivalent error.  I don't think that's a valid usage of flex-array.

> See https://godbolt.org/z/TbcYhEW5d, it says:
>
>         <source>(5): error C2229: struct 'MyStruct' has an illegal zero-sized array
>         Compiler returned: 2
>
> > (Note that I'm passing /TC to force C compilation).
>
> Yes, `/TC` is one of the flags we pass to MSVC. For more details, see
> https://github.com/git-for-windows/git/runs/4389081542?check_suite_focus=true#step:9:125
>
> > Also, heap_fsentry doesn't appear to use a flex array in current sources.
>
> It does, but it is admittedly a bit convoluted and not very easy to see.
> The definition of `heap_fsentry` is
> [this](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L77-L80):
>
>         struct heap_fsentry {
>                 struct fsentry ent;
>                 char dummy[MAX_LONG_PATH];
>         };
>
> No flex array here, right? But wait, there is a `struct fsentry`. Let's
> look at
> [that](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L43-L74):
>
>         struct fsentry {
>                 struct hashmap_entry ent;
>                 [...]
>                 /*
>                  * Name of the entry. For directory listings: relative path of the
>                  * directory, without trailing '/' (empty for cwd()). For file
>                  * entries:
>                  * name of the file. Typically points to the end of the structure
>                  * if
>                  * the fsentry is allocated on the heap (see fsentry_alloc), or to
>                  * a
>                  * local variable if on the stack (see fsentry_init).
>                  */
>                 struct dirent dirent;
>         };
>
> Still no flex array, right? But wait, there is a `struct dirent`. Let's
> [see](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/dirent.h#L9-L12):
>
>         struct dirent {
>                 unsigned char d_type; /* file type to prevent lstat after readdir */
>                 char d_name[FLEX_ARRAY]; /* file name */
>         };
>
> Finally! We see the flex array.
>
> Now, you may ask why is this even correct? How can you have an empty-sized
> field in a struct that is inside another struct that is inside yet another
> struct _and then followed by another field_?
>
> The reason why this is correct and intended is that `struct dirent`
> intentionally leaves the length of the `d_name` undefined, to leave it to
> the implementation whether a fixed-size buffer is used or a
> specifically-allocated one of the exact correct size for a _specific_
> directory entry.
>
> In FSCache, we want to allocate a large-enough buffer to fit _any_ file
> name, and it should not only contain the metadata in `struct dirent`, but
> additionally some FSCache-specific metadata.
>
> Therefore, `struct fsentry` is kind of a subclass of `struct dirent`, and
> `struct heap_fsentry` is kind of a subclass of something that does not
> exist, a `struct dirent` that offers enough space to fit _any_ legal
> `d_name` (that is what that `dummy` field is for, it is not actually
> intended to be accessed except via `d_name`).
>
> > If it does start using it, there issue may actually be elsewhere besides
> > the struct definition (it could be just a badly targeted compiler error).
> > We have code like `struct heap_fsentry key[2];`.  That declaration can't
> > work with a flex array.
>
> I hope my explanation above made sense to you.
>
> Admittedly, it is slightly icky code, but honestly, I do not have any
> splendid idea how to make it less complicated to understand. Do you?

Thanks for explaining all of this.  It was hard for me to see what was
going on before.

So when trying the same thing with Clang, this construct is claimed to
be a GNU extension: https://godbolt.org/z/q3ndr57Pf

The fix I'd propose (uncomment the line defining FIXED_DEF in godbolt)
is to use a union where the char buf has the size of the fsentry
_plus_ the desired buffer.

Thanks,
Neeraj
Randall S. Becker Dec. 7, 2021, 10:59 p.m. UTC | #5
On December 7, 2021 5:22 PM, Neeraj Singh wrote:
> On Tue, Dec 7, 2021 at 1:33 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > Hi Neeraj,
> >
> > On Mon, 6 Dec 2021, Neeraj Singh wrote:
> >
> > > I'm a little confused by this issue. Looks like an empty flex array
> > > is supported here: https://godbolt.org/z/j3ndYTEfx.
> >
> > It is somewhat strange, but understandable, that the empty flex array
> > at the end of a struct isn't triggering a compile error. But having a
> > field _after_ that empty flex array _does_ trigger a compile error:
> >
> > struct MyStruct {
> >     int x;
> >     int flexA[];
> >     char string[256];
> > };
> >
> > Note the added `string` field.
> >
> 
> Please see https://godbolt.org/z/4Tb9PobYM.  GCC throws a morally
> equivalent error.  I don't think that's a valid usage of flex-array.
> 
> > See https://godbolt.org/z/TbcYhEW5d, it says:
> >
> >         <source>(5): error C2229: struct 'MyStruct' has an illegal zero-sized
> array
> >         Compiler returned: 2
> >
> > > (Note that I'm passing /TC to force C compilation).
> >
> > Yes, `/TC` is one of the flags we pass to MSVC. For more details, see
> > https://github.com/git-for-windows/git/runs/4389081542?check_suite_foc
> > us=true#step:9:125
> >
> > > Also, heap_fsentry doesn't appear to use a flex array in current sources.
> >
> > It does, but it is admittedly a bit convoluted and not very easy to see.
> > The definition of `heap_fsentry` is
> > [this](https://github.com/git-for-
> windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L77-L80):
> >
> >         struct heap_fsentry {
> >                 struct fsentry ent;
> >                 char dummy[MAX_LONG_PATH];
> >         };
> >
> > No flex array here, right? But wait, there is a `struct fsentry`.
> > Let's look at
> > [that](https://github.com/git-for-
> windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L43-L74):
> >
> >         struct fsentry {
> >                 struct hashmap_entry ent;
> >                 [...]
> >                 /*
> >                  * Name of the entry. For directory listings: relative path of the
> >                  * directory, without trailing '/' (empty for cwd()). For file
> >                  * entries:
> >                  * name of the file. Typically points to the end of the structure
> >                  * if
> >                  * the fsentry is allocated on the heap (see fsentry_alloc), or to
> >                  * a
> >                  * local variable if on the stack (see fsentry_init).
> >                  */
> >                 struct dirent dirent;
> >         };
> >
> > Still no flex array, right? But wait, there is a `struct dirent`.
> > Let's
> > [see](https://github.com/git-for-
> windows/git/blob/v2.34.1.windows.1/compat/win32/dirent.h#L9-L12):
> >
> >         struct dirent {
> >                 unsigned char d_type; /* file type to prevent lstat after readdir */
> >                 char d_name[FLEX_ARRAY]; /* file name */
> >         };
> >
> > Finally! We see the flex array.
> >
> > Now, you may ask why is this even correct? How can you have an
> > empty-sized field in a struct that is inside another struct that is
> > inside yet another struct _and then followed by another field_?
> >
> > The reason why this is correct and intended is that `struct dirent`
> > intentionally leaves the length of the `d_name` undefined, to leave it
> > to the implementation whether a fixed-size buffer is used or a
> > specifically-allocated one of the exact correct size for a _specific_
> > directory entry.
> >
> > In FSCache, we want to allocate a large-enough buffer to fit _any_
> > file name, and it should not only contain the metadata in `struct
> > dirent`, but additionally some FSCache-specific metadata.
> >
> > Therefore, `struct fsentry` is kind of a subclass of `struct dirent`,
> > and `struct heap_fsentry` is kind of a subclass of something that does
> > not exist, a `struct dirent` that offers enough space to fit _any_
> > legal `d_name` (that is what that `dummy` field is for, it is not
> > actually intended to be accessed except via `d_name`).
> >
> > > If it does start using it, there issue may actually be elsewhere
> > > besides the struct definition (it could be just a badly targeted compiler
> error).
> > > We have code like `struct heap_fsentry key[2];`.  That declaration
> > > can't work with a flex array.
> >
> > I hope my explanation above made sense to you.
> >
> > Admittedly, it is slightly icky code, but honestly, I do not have any
> > splendid idea how to make it less complicated to understand. Do you?
> 
> Thanks for explaining all of this.  It was hard for me to see what was going on
> before.
> 
> So when trying the same thing with Clang, this construct is claimed to be a
> GNU extension: https://godbolt.org/z/q3ndr57Pf
> 
> The fix I'd propose (uncomment the line defining FIXED_DEF in godbolt) is to
> use a union where the char buf has the size of the fsentry _plus_ the desired
> buffer.

I don't know which compiler versions support this check, but it is customary to have a union with the largest allocation first, then smaller allocations following in separate sub-union fields. This may not match your intent.

-Randall
Junio C Hamano Dec. 9, 2021, 1:39 a.m. UTC | #6
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> This seems to be required to define `FLEX_ARRAY` in a manner that MSVC
> in C11 mode accepts. Without this fix, building Git for Windows'
> experimental FSCache code would fail thusly:

So, what's the final verdict on this one, after a few back and forth
to clarify the "seems to be required" above?

In the meantime, I am tempted to queue the following as a pure
clean-up.

Thoughts?

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
Subject: [PATCH] flex-array: simplify compiler-specific workaround

We use "type array[];" syntax for the flex-array member at the end
of a struct under C99 or later, except when we are building with
older SUNPRO_C compilers.  As we find more vendor compilers that
claim to grok C99 but not understand  ts flex-array syntax, the
existing "If we are using C99, but not with these compilers..."
conditional will keep growing.

Make it more manageable by listing vendor-specific exceptions
earlier, with the expectation that new exceptions will not be
combined into existing ones to make the condition longer, and
instead will be implemented as a new "#elif" in the cascade of
#if/#elif/#else/#endif.  E.g. if MSC is found to have a quirk
similar to old SUNPRO_C, we can just add a single line 

    #elif defined(_MSC_VER)

immediately before "#elif defined(__GNUC__)" to cause us to fallback
to the safer but a bit wasteful version.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 git-compat-util.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git c/git-compat-util.h w/git-compat-util.h
index c6bd2a84e5..9ba047a58e 100644
--- c/git-compat-util.h
+++ w/git-compat-util.h
@@ -33,14 +33,23 @@
 /*
  * See if our compiler is known to support flexible array members.
  */
-#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
-# define FLEX_ARRAY /* empty */
+
+/*
+ * Check vendor specific quirks first, before checking the
+ * __STDC_VERSION__, as vendor compilers can lie and we need to be
+ * able to work them around.  Note that by not defining FLEX_ARRAY
+ * here, we can fall back to use the "safer but a bit wasteful" one
+ * later.
+ */
+#if defined(__SUNPRO_C) && (__SUNPRO_C <= 0x580)
 #elif defined(__GNUC__)
 # if (__GNUC__ >= 3)
 #  define FLEX_ARRAY /* empty */
 # else
 #  define FLEX_ARRAY 0 /* older GNU extension */
 # endif
+#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
+# define FLEX_ARRAY /* empty */
 #endif
 
 /*
Neeraj Singh Dec. 9, 2021, 1:49 a.m. UTC | #7
On Wed, Dec 08, 2021 at 05:39:39PM -0800, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
> 
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > This seems to be required to define `FLEX_ARRAY` in a manner that MSVC
> > in C11 mode accepts. Without this fix, building Git for Windows'
> > experimental FSCache code would fail thusly:
> 
> So, what's the final verdict on this one, after a few back and forth
> to clarify the "seems to be required" above?
> 
> In the meantime, I am tempted to queue the following as a pure
> clean-up.
> 
> Thoughts?
> 
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: [PATCH] flex-array: simplify compiler-specific workaround
> 
> We use "type array[];" syntax for the flex-array member at the end
> of a struct under C99 or later, except when we are building with
> older SUNPRO_C compilers.  As we find more vendor compilers that
> claim to grok C99 but not understand  ts flex-array syntax, the
> existing "If we are using C99, but not with these compilers..."
> conditional will keep growing.
> 
> Make it more manageable by listing vendor-specific exceptions
> earlier, with the expectation that new exceptions will not be
> combined into existing ones to make the condition longer, and
> instead will be implemented as a new "#elif" in the cascade of
> #if/#elif/#else/#endif.  E.g. if MSC is found to have a quirk
> similar to old SUNPRO_C, we can just add a single line 
> 
>     #elif defined(_MSC_VER)
> 
> immediately before "#elif defined(__GNUC__)" to cause us to fallback
> to the safer but a bit wasteful version.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  git-compat-util.h | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git c/git-compat-util.h w/git-compat-util.h
> index c6bd2a84e5..9ba047a58e 100644
> --- c/git-compat-util.h
> +++ w/git-compat-util.h
> @@ -33,14 +33,23 @@
>  /*
>   * See if our compiler is known to support flexible array members.
>   */
> -#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
> -# define FLEX_ARRAY /* empty */
> +
> +/*
> + * Check vendor specific quirks first, before checking the
> + * __STDC_VERSION__, as vendor compilers can lie and we need to be
> + * able to work them around.  Note that by not defining FLEX_ARRAY
> + * here, we can fall back to use the "safer but a bit wasteful" one
> + * later.
> + */
> +#if defined(__SUNPRO_C) && (__SUNPRO_C <= 0x580)
>  #elif defined(__GNUC__)
>  # if (__GNUC__ >= 3)
>  #  define FLEX_ARRAY /* empty */
>  # else
>  #  define FLEX_ARRAY 0 /* older GNU extension */
>  # endif
> +#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
> +# define FLEX_ARRAY /* empty */
>  #endif
>  
>  /*
> 
> 

Your change is an improvement that should be easier to read and avoid conflicts when adding quirks for compilers.

LGTM.

I hope on Dscho's side that they change the fscache code to use the union, since I don't believe the current code
is C-standards compliant and at least in this case MSVC is not actually broken.

-Neeraj
Phillip Wood Dec. 9, 2021, 11 a.m. UTC | #8
Hi Dscho

On 07/12/2021 21:33, Johannes Schindelin wrote:
> Hi Neeraj,
> 
> On Mon, 6 Dec 2021, Neeraj Singh wrote:
> 
>> I'm a little confused by this issue. Looks like an empty flex array
>> is supported here: https://godbolt.org/z/j3ndYTEfx.
> 
> It is somewhat strange, but understandable, that the empty flex array at
> the end of a struct isn't triggering a compile error. But having a field
> _after_ that empty flex array _does_ trigger a compile error:
> 
> struct MyStruct {
>      int x;
>      int flexA[];
>      char string[256];
> };
> 
> Note the added `string` field.

Having a field after the flex array is a violation of the C standard. 
Section 6.7.2.1:
    ... the last member of a structure with more than one named member
    may have incomplete array type, such a structure (and any union
    containing, possibly recursively, a member that is such a structure)
    shall not be a member of a structure or an element of an array.

Note that the wording also forbids

struct A {
	int x;
	char flex[];
};

struct B {
	struct A a; /* This is forbidden */
};

There was a proposal a few years ago to relax that restriction [1] but 
it does not seem to be in the latest draft standard.

None of this helps fix the problem, but it does explain why MSVC complains.

Best Wishes

Phillip

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2083.htm

> See https://godbolt.org/z/TbcYhEW5d, it says:
> 
> 	<source>(5): error C2229: struct 'MyStruct' has an illegal zero-sized array
> 	Compiler returned: 2
> 
>> (Note that I'm passing /TC to force C compilation).
> 
> Yes, `/TC` is one of the flags we pass to MSVC. For more details, see
> https://github.com/git-for-windows/git/runs/4389081542?check_suite_focus=true#step:9:125
> 
>> Also, heap_fsentry doesn't appear to use a flex array in current sources.
> 
> It does, but it is admittedly a bit convoluted and not very easy to see.
> The definition of `heap_fsentry` is
> [this](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L77-L80):
> 
> 	struct heap_fsentry {
> 		struct fsentry ent;
> 		char dummy[MAX_LONG_PATH];
> 	};
> 
> No flex array here, right? But wait, there is a `struct fsentry`. Let's
> look at
> [that](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/fscache.c#L43-L74):
> 
> 	struct fsentry {
> 		struct hashmap_entry ent;
> 		[...]
> 		/*
> 		 * Name of the entry. For directory listings: relative path of the
> 		 * directory, without trailing '/' (empty for cwd()). For file
> 		 * entries:
> 		 * name of the file. Typically points to the end of the structure
> 		 * if
> 		 * the fsentry is allocated on the heap (see fsentry_alloc), or to
> 		 * a
> 		 * local variable if on the stack (see fsentry_init).
> 		 */
> 		struct dirent dirent;
> 	};
> 
> Still no flex array, right? But wait, there is a `struct dirent`. Let's
> [see](https://github.com/git-for-windows/git/blob/v2.34.1.windows.1/compat/win32/dirent.h#L9-L12):
> 
> 	struct dirent {
> 		unsigned char d_type; /* file type to prevent lstat after readdir */
> 		char d_name[FLEX_ARRAY]; /* file name */
> 	};
> 
> Finally! We see the flex array.
> 
> Now, you may ask why is this even correct? How can you have an empty-sized
> field in a struct that is inside another struct that is inside yet another
> struct _and then followed by another field_?
> 
> The reason why this is correct and intended is that `struct dirent`
> intentionally leaves the length of the `d_name` undefined, to leave it to
> the implementation whether a fixed-size buffer is used or a
> specifically-allocated one of the exact correct size for a _specific_
> directory entry.
> 
> In FSCache, we want to allocate a large-enough buffer to fit _any_ file
> name, and it should not only contain the metadata in `struct dirent`, but
> additionally some FSCache-specific metadata.
> 
> Therefore, `struct fsentry` is kind of a subclass of `struct dirent`, and
> `struct heap_fsentry` is kind of a subclass of something that does not
> exist, a `struct dirent` that offers enough space to fit _any_ legal
> `d_name` (that is what that `dummy` field is for, it is not actually
> intended to be accessed except via `d_name`).
> 
>> If it does start using it, there issue may actually be elsewhere besides
>> the struct definition (it could be just a badly targeted compiler error).
>> We have code like `struct heap_fsentry key[2];`.  That declaration can't
>> work with a flex array.
> 
> I hope my explanation above made sense to you.
> 
> Admittedly, it is slightly icky code, but honestly, I do not have any
> splendid idea how to make it less complicated to understand. Do you?
> 
> Ciao,
> Dscho
>
Junio C Hamano Dec. 9, 2021, 9:18 p.m. UTC | #9
Phillip Wood <phillip.wood123@gmail.com> writes:

>> the end of a struct isn't triggering a compile error. But having a field
>> _after_ that empty flex array _does_ trigger a compile error:
>> struct MyStruct {
>>      int x;
>>      int flexA[];
>>      char string[256];
>> };
>> Note the added `string` field.

Ick.

If we use the "safer but a bit wasteful" alternative in such a case,
it will become just wrong.  We'll have a single-member array there,
and extra allocation for an instance of MyStruct will only lengthen
the .string member, without allowing the .flexA member to have more
elements.

> Having a field after the flex array is a violation of the C
> standard. Section 6.7.2.1:
>    ... the last member of a structure with more than one named member
>    may have incomplete array type, such a structure (and any union
>    containing, possibly recursively, a member that is such a structure)
>    shall not be a member of a structure or an element of an array.
>
> Note that the wording also forbids
>
> struct A {
> 	int x;
> 	char flex[];
> };
>
> struct B {
> 	struct A a; /* This is forbidden */
> };
>
> There was a proposal a few years ago to relax that restriction [1] but
> it does not seem to be in the latest draft standard.
>
> None of this helps fix the problem, but it does explain why MSVC complains.

Thanks.

In short, the code is wrong, and the compiler is complaining to
avoid us using a wrong code.  If the same code is given to other
compilers that support ANCI C style flexible array member, we are
likely to see the same error.

Which makes the whole thing (including my "let's make the
conditionals to set up FLEX_ARRAY for various compilers easier to
read") much less urgent.

Thanks for a well-written summary.
diff mbox series

Patch

diff --git a/git-compat-util.h b/git-compat-util.h
index 19943e214ba..c9f508b3a83 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -46,7 +46,7 @@ 
 /*
  * See if our compiler is known to support flexible array members.
  */
-#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
+#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && !defined(_MSC_VER) && (!defined(__SUNPRO_C) || (__SUNPRO_C > 0x580))
 # define FLEX_ARRAY /* empty */
 #elif defined(__GNUC__)
 # if (__GNUC__ >= 3)