diff mbox series

[v5] compat: auto-detect if zlib has uncompress2()

Message ID xmqqr18x3s5s.fsf@gitster.g (mailing list archive)
State Accepted
Commit 07564773c2569d012719ab9e26b9b27251f3d354
Headers show
Series [v5] compat: auto-detect if zlib has uncompress2() | expand

Commit Message

Junio C Hamano Jan. 24, 2022, 6:27 p.m. UTC
From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

We have a copy of uncompress2() implementation in compat/ so that we
can build with an older version of zlib that lack the function, and
the build procedure selects if it is used via the NO_UNCOMPRESS2
$(MAKE) variable.  This is yet another "annoying" knob the porters
need to tweak on platforms that are not common enough to have the
default set in the config.mak.uname file.

Attempt to instead ask the system header <zlib.h> to decide if we
need the compatibility implementation.  This is a deviation from the
way we have been handling the "compatiblity" features so far, and if
it can be done cleanly enough, it could work as a model for features
that need compatibility definition we discover in the future.  With
that goal in mind, avoid expedient but ugly hacks, like shoving the
code that is conditionally compiled into an unrelated .c file, which
may not work in future cases---instead, take an approach that uses a
file that is independently compiled and stands on its own.

Compile and link compat/zlib-uncompress2.c file unconditionally, but
conditionally hide the implementation behind #if/#endif when zlib
version is 1.2.9 or newer, and unconditionally archive the resulting
object file in the libgit.a to be picked up by the linker.

There are a few things to note in the shape of the code base after
this change:

 - We no longer use NO_UNCOMPRESS2 knob; if the system header
   <zlib.h> claims a version that is more cent than the library
   actually is, this would break, but it is easy to add it back when
   we find such a system.

 - The object file compat/zlib-uncompress2.o is always compiled and
   archived in libgit.a, just like a few other compat/ object files
   already are.

 - The inclusion of <zlib.h> is done in <git-compat-util.h>; we used
   to do so from <cache.h> which includes <git-compat-util.h> as the
   first thing it does, so from the *.c codes, there is no practical
   change.

 - Until objects in libgit.a that is already used gains a reference
   to the function, the reftable code will be the only one that
   wants it, so libgit.a on the linker command line needs to appear
   once more at the end to satisify the mutual dependency.

 - Beat found a trick used by OpenSSL to avoid making the
   conditionally-compiled object truly empty (apparently because
   they had to deal with compilers that do not want to see an
   effectively empty input file).  Our compat/zlib-uncompress2.c
   file borrows the same trick for portabilty.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * With a single-liner update to the Makefile with an updated log
   message that explains the change.  I am not sure if this version
   can become the model of future "compat" support, or we should
   just discard the new approach and use the Makefile macro approach
   that has worked well for all of our existing compat support
   already.

Range-diff against v4:
1:  d26c9073bf ! 1:  63c5753c6f compat: auto-detect if zlib has uncompress2()
    @@ Commit message
         need to tweak on platforms that are not common enough to have the
         default set in the config.mak.uname file.
     
    -    Ævar came up with an idea to instead ask the system header <zlib.h>
    -    to decide if we need the compatibility implementation.  We can
    -    compile and link compat/zlib-uncompress2.c file unconditionally, but
    +    Attempt to instead ask the system header <zlib.h> to decide if we
    +    need the compatibility implementation.  This is a deviation from the
    +    way we have been handling the "compatiblity" features so far, and if
    +    it can be done cleanly enough, it could work as a model for features
    +    that need compatibility definition we discover in the future.  With
    +    that goal in mind, avoid expedient but ugly hacks, like shoving the
    +    code that is conditionally compiled into an unrelated .c file, which
    +    may not work in future cases---instead, take an approach that uses a
    +    file that is independently compiled and stands on its own.
    +
    +    Compile and link compat/zlib-uncompress2.c file unconditionally, but
         conditionally hide the implementation behind #if/#endif when zlib
         version is 1.2.9 or newer, and unconditionally archive the resulting
         object file in the libgit.a to be picked up by the linker.
    @@ Commit message
            first thing it does, so from the *.c codes, there is no practical
            change.
     
    +     - Until objects in libgit.a that is already used gains a reference
    +       to the function, the reftable code will be the only one that
    +       wants it, so libgit.a on the linker command line needs to appear
    +       once more at the end to satisify the mutual dependency.
    +
          - Beat found a trick used by OpenSSL to avoid making the
            conditionally-compiled object truly empty (apparently because
            they had to deal with compilers that do not want to see an
    @@ Makefile: THIRD_PARTY_SOURCES += compat/regex/%
      THIRD_PARTY_SOURCES += sha1dc/%
      
     -GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
    ++# xdiff and reftable libs may in turn depend on what is in libgit.a
     +GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
      EXTLIBS =
      

 Makefile                  | 11 +++--------
 cache.h                   |  1 -
 ci/lib.sh                 |  1 -
 compat/zlib-uncompress2.c | 11 ++++++-----
 config.mak.uname          |  7 -------
 configure.ac              | 13 -------------
 git-compat-util.h         | 12 ++++++++++++
 reftable/system.h         | 11 -----------
 8 files changed, 21 insertions(+), 46 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Jan. 24, 2022, 7:07 p.m. UTC | #1
On Mon, Jan 24 2022, Junio C Hamano wrote:

> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> [...]
> Attempt to instead ask the system header <zlib.h> to decide if we
> need the compatibility implementation.  This is a deviation from the
> way we have been handling the "compatiblity" features so far, and if
> it can be done cleanly enough, it could work as a model for features
> that need compatibility definition we discover in the future.  With
> that goal in mind, avoid expedient but ugly hacks, like shoving the
> code that is conditionally compiled into an unrelated .c file, which
> may not work in future cases---instead, take an approach that uses a
> file that is independently compiled and stands on its own.
> [...]

I think so, we do auto-probes on version (e.g. for libcurl and libpcre),
but I don't think we've outright defined some missing functions on that
basis, we have defined some missing macros though.

FWIW I did some experiments the other day with handling this via the
Makefile.

I.e. similar to how we generate the .depend/* dirs to optionally inject
the building of various "configure" programs into the build step, so
that e.g. git.c would depend on trying to compile a probe/uncompress2.c
or whatever. We'd then write NO_UNCOMPRESS2=$(FOUND_IT) to a file we'd
"include".

I think it's a generally nice thing to move towards, is relatively
light, and would eventually allow us to get rid of the optional
configure.ac (it would become redundant). We could also use it for
e.g. a more accurate "detect-compiler" shellscript (instead of trying to
guess from clang/gcc versions).

>  - Beat found a trick used by OpenSSL to avoid making the
>    conditionally-compiled object truly empty (apparently because
>    they had to deal with compilers that do not want to see an
>    effectively empty input file).  Our compat/zlib-uncompress2.c
>    file borrows the same trick for portabilty.

Aside: I have not yet found such a compiler, does anyone know of one
that breaks? In any case doing this for good measure seems fine, just
wondering if we're cargo-culting a needless workaround or not.

>  * With a single-liner update to the Makefile with an updated log
>    message that explains the change.  I am not sure if this version
>    can become the model of future "compat" support, or we should
>    just discard the new approach and use the Makefile macro approach
>    that has worked well for all of our existing compat support
>    already.
> [...]
>      -GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
>     ++# xdiff and reftable libs may in turn depend on what is in libgit.a
>      +GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
>       EXTLIBS =

As noted I'm rather "who cares? :)" on the question of how we get the
uncompress2() compatibility function over to its eventual location
(directly or via *.o linker indirection), but this approch in v5 works &
addresses the issue I noted in v4. So this looks good to me!

FWIW I've tested this with this ad-hoc patch to simulate not having
uncompress2, and on systems that genuinely don't have it:
    
    diff --git a/compat/zlib-uncompress2.c b/compat/zlib-uncompress2.c
    index 77a1b080484..539c14c2b54 100644
    --- a/compat/zlib-uncompress2.c
    +++ b/compat/zlib-uncompress2.c
    @@ -3 +3 @@
    -#if ZLIB_VERNUM < 0x1290
    +#if 1
    @@ -37 +37 @@
    -int ZEXPORT uncompress2 (
    +int ZEXPORT uncompress3 (
    diff --git a/git-compat-util.h b/git-compat-util.h
    index ea111a7b481..a6df48d76d5 100644
    --- a/git-compat-util.h
    +++ b/git-compat-util.h
    @@ -1392 +1392 @@ void unleak_memory(const void *ptr, size_t len);
    -#if ZLIB_VERNUM < 0x1290
    +#if 1
    @@ -1397 +1397 @@ void unleak_memory(const void *ptr, size_t len);
    -int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source,
    +int uncompress3(Bytef *dest, uLongf *destLen, const Bytef *source,
    diff --git a/reftable/block.c b/reftable/block.c
    index 855e3f5c947..ee341eb65fe 100644
    --- a/reftable/block.c
    +++ b/reftable/block.c
    @@ -213 +213 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block,
    -		    uncompress2(uncompressed + block_header_skip, &dst_len,
    +		    uncompress3(uncompressed + block_header_skip, &dst_len,
Junio C Hamano Jan. 24, 2022, 8:21 p.m. UTC | #2
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Aside: I have not yet found such a compiler, does anyone know of one
> that breaks? In any case doing this for good measure seems fine, just
> wondering if we're cargo-culting a needless workaround or not.

Before I started Git, I had to deal with quite a many variations of
UNIX, all of which looked alike but behaved slightly differently,
and I do recall seeing this exact breakage, so it is a real solution
to a real problem, and I can see OpenSSL folks had seen the same one.

If you find my experience is not Enough, I have no further words for
you on this topic.

If the question is "name a compiler that breaks and is *still* in
active use", then the answer would be fuzzy (it depends on the
definition of "in active use"), but is useful to find out.
Carlo Marcelo Arenas Belón Jan. 25, 2022, 10:11 a.m. UTC | #3
On Mon, Jan 24, 2022 at 12:21 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> If the question is "name a compiler that breaks and is *still* in
> active use", then the answer would be fuzzy (it depends on the
> definition of "in active use"), but is useful to find out.

`gcc -pedantic -werror` will abort the build (ISO C forbids an empty
translation unit) because an empty translation unit is not
syntactically correct code per ISO, as well as clang (ISO C requires a
translation unit to contain at least one declaration
[-Wempty-translation-unit]).

Carlo
Junio C Hamano Jan. 25, 2022, 6:11 p.m. UTC | #4
Carlo Arenas <carenas@gmail.com> writes:

> On Mon, Jan 24, 2022 at 12:21 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> If the question is "name a compiler that breaks and is *still* in
>> active use", then the answer would be fuzzy (it depends on the
>> definition of "in active use"), but is useful to find out.
>
> `gcc -pedantic -werror` will abort the build (ISO C forbids an empty
> translation unit) because an empty translation unit is not
> syntactically correct code per ISO, as well as clang (ISO C requires a
> translation unit to contain at least one declaration
> [-Wempty-translation-unit]).

Ah, I remember that one, and it cleanly concludes the thread.
Thanks.
Ævar Arnfjörð Bjarmason Jan. 25, 2022, 7:12 p.m. UTC | #5
On Mon, Jan 24 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> Aside: I have not yet found such a compiler, does anyone know of one
>> that breaks? In any case doing this for good measure seems fine, just
>> wondering if we're cargo-culting a needless workaround or not.
>
> Before I started Git, I had to deal with quite a many variations of
> UNIX, all of which looked alike but behaved slightly differently,
> and I do recall seeing this exact breakage, so it is a real solution
> to a real problem, and I can see OpenSSL folks had seen the same one.
>
> If you find my experience is not Enough, I have no further words for
> you on this topic.

I wasn't alleging that this issue was a figment of someone's
imagination, but probing for whether it was a current issue or
not. I.e. it could have been something that only mattered in the GCC 2.x
era, and OpenSSL was still carrying.

But as Carlo notes downthread it's to do with ISO C strictness and
-Wempty-translation-unit. So we'd have caught it under DEVELOPER=1, but
due to the recent over-strictness of DEVELOPER I'd disabled it on the
test boxes involved, so I didn't spot that.

> If the question is "name a compiler that breaks and is *still* in
> active use", then the answer would be fuzzy (it depends on the
> definition of "in active use"), but is useful to find out.

But as to how to deal with it this is good enough for now, but perhaps
we'll consider something like this for a future compat/*.c addition:
    
    $ file compat/blah.c
    compat/blah.c: empty
    $ diff --git a/Makefile b/Makefile
    index 5da099e8a16..62ec13c72fd 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -571 +571 @@ COMPAT_CFLAGS =
    -COMPAT_OBJS =
    +COMPAT_OBJS = compat/blah.o
    @@ -2604,0 +2605 @@ endif
    +compat/blah.o: EXTRA_CPPFLAGS += -Wno-empty-translation-unit

That will compile cleanly on GCC and Clang with DEVELOPER=1, which of
course may leave some other strictly ISO C compiler unsatisfied.
Junio C Hamano Jan. 26, 2022, 7:23 a.m. UTC | #6
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> That will compile cleanly on GCC and Clang with DEVELOPER=1, which of
> course may leave some other strictly ISO C compiler unsatisfied.

I agree with you that, compared to the solution we borrow from
OpenSSL, it would be of limited utility.  Let's not go there.
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 5580859afd..f194b2afc6 100644
--- a/Makefile
+++ b/Makefile
@@ -256,8 +256,6 @@  all::
 #
 # Define NO_DEFLATE_BOUND if your zlib does not have deflateBound.
 #
-# Define NO_UNCOMPRESS2 if your zlib does not have uncompress2.
-#
 # Define NO_NORETURN if using buggy versions of gcc 4.6+ and profile feedback,
 # as the compiler can crash (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49299)
 #
@@ -862,6 +860,7 @@  LIB_OBJS += commit-reach.o
 LIB_OBJS += commit.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
+LIB_OBJS += compat/zlib-uncompress2.o
 LIB_OBJS += config.o
 LIB_OBJS += connect.o
 LIB_OBJS += connected.o
@@ -1194,7 +1193,8 @@  THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
+# xdiff and reftable libs may in turn depend on what is in libgit.a
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -1726,11 +1726,6 @@  ifdef NO_DEFLATE_BOUND
 	BASIC_CFLAGS += -DNO_DEFLATE_BOUND
 endif
 
-ifdef NO_UNCOMPRESS2
-	BASIC_CFLAGS += -DNO_UNCOMPRESS2
-	REFTABLE_OBJS += compat/zlib-uncompress2.o
-endif
-
 ifdef NO_POSIX_GOODIES
 	BASIC_CFLAGS += -DNO_POSIX_GOODIES
 endif
diff --git a/cache.h b/cache.h
index 281f00ab1b..3a0142aa56 100644
--- a/cache.h
+++ b/cache.h
@@ -18,7 +18,6 @@ 
 #include "repository.h"
 #include "mem-pool.h"
 
-#include <zlib.h>
 typedef struct git_zstream {
 	z_stream z;
 	unsigned long avail_in;
diff --git a/ci/lib.sh b/ci/lib.sh
index 9d28ab50fb..cbc2f8f1ca 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -197,7 +197,6 @@  esac
 case "$jobname" in
 linux32)
 	CC=gcc
-	MAKEFLAGS="$MAKEFLAGS NO_UNCOMPRESS2=1"
 	;;
 linux-musl)
 	CC=gcc
diff --git a/compat/zlib-uncompress2.c b/compat/zlib-uncompress2.c
index 722610b971..77a1b08048 100644
--- a/compat/zlib-uncompress2.c
+++ b/compat/zlib-uncompress2.c
@@ -1,3 +1,6 @@ 
+#include "git-compat-util.h"
+
+#if ZLIB_VERNUM < 0x1290
 /* taken from zlib's uncompr.c
 
    commit cacf7f1d4e3d44d871b605da3b647f07d718623f
@@ -8,16 +11,11 @@ 
 
 */
 
-#include "../reftable/system.h"
-#define z_const
-
 /*
  * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
  * For conditions of distribution and use, see copyright notice in zlib.h
  */
 
-#include <zlib.h>
-
 /* clang-format off */
 
 /* ===========================================================================
@@ -93,3 +91,6 @@  int ZEXPORT uncompress2 (
 	   err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
 	   err;
 }
+#else
+static void *dummy_variable = &dummy_variable;
+#endif
diff --git a/config.mak.uname b/config.mak.uname
index c48db45106..92ea00c219 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -66,7 +66,6 @@  ifeq ($(uname_S),Linux)
 	# centos7/rhel7 provides gcc 4.8.5 and zlib 1.2.7.
 	ifneq ($(findstring .el7.,$(uname_R)),)
 		BASIC_CFLAGS += -std=c99
-		NO_UNCOMPRESS2 = YesPlease
 	endif
 endif
 ifeq ($(uname_S),GNU/kFreeBSD)
@@ -266,10 +265,6 @@  ifeq ($(uname_S),FreeBSD)
 	FILENO_IS_A_MACRO = UnfortunatelyYes
 endif
 ifeq ($(uname_S),OpenBSD)
-	# Versions < 7.0 need compatibility layer
-	ifeq ($(shell expr "$(uname_R)" : "[1-6]\."),2)
-		NO_UNCOMPRESS2 = UnfortunatelyYes
-	endif
 	NO_STRCASESTR = YesPlease
 	NO_MEMMEM = YesPlease
 	USE_ST_TIMESPEC = YesPlease
@@ -525,7 +520,6 @@  ifeq ($(uname_S),Interix)
 	endif
 endif
 ifeq ($(uname_S),Minix)
-	NO_UNCOMPRESS2 = YesPlease
 	NO_IPV6 = YesPlease
 	NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease
 	NO_NSEC = YesPlease
@@ -581,7 +575,6 @@  ifeq ($(uname_S),NONSTOP_KERNEL)
 	NO_SETENV = YesPlease
 	NO_UNSETENV = YesPlease
 	NO_MKDTEMP = YesPlease
-	NO_UNCOMPRESS2 = YesPlease
 	# Currently libiconv-1.9.1.
 	OLD_ICONV = UnfortunatelyYes
 	NO_REGEX = NeedsStartEnd
diff --git a/configure.ac b/configure.ac
index d60d494ee4..5ee25ec95c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -664,22 +664,9 @@  AC_LINK_IFELSE([ZLIBTEST_SRC],
 	NO_DEFLATE_BOUND=yes])
 LIBS="$old_LIBS"
 
-AC_DEFUN([ZLIBTEST_UNCOMPRESS2_SRC], [
-AC_LANG_PROGRAM([#include <zlib.h>],
- [uncompress2(NULL,NULL,NULL,NULL);])])
-AC_MSG_CHECKING([for uncompress2 in -lz])
-old_LIBS="$LIBS"
-LIBS="$LIBS -lz"
-AC_LINK_IFELSE([ZLIBTEST_UNCOMPRESS2_SRC],
-	[AC_MSG_RESULT([yes])],
-	[AC_MSG_RESULT([no])
-	NO_UNCOMPRESS2=yes])
-LIBS="$old_LIBS"
-
 GIT_UNSTASH_FLAGS($ZLIB_PATH)
 
 GIT_CONF_SUBST([NO_DEFLATE_BOUND])
-GIT_CONF_SUBST([NO_UNCOMPRESS2])
 
 #
 # Define NEEDS_SOCKET if linking with libc is not enough (SunOS,
diff --git a/git-compat-util.h b/git-compat-util.h
index 1229c8296b..ea111a7b48 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1386,6 +1386,18 @@  void unleak_memory(const void *ptr, size_t len);
 #define UNLEAK(var) do {} while (0)
 #endif
 
+#define z_const
+#include <zlib.h>
+
+#if ZLIB_VERNUM < 0x1290
+/*
+ * This is uncompress2, which is only available in zlib >= 1.2.9
+ * (released as of early 2017). See compat/zlib-uncompress2.c.
+ */
+int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source,
+		uLong *sourceLen);
+#endif
+
 /*
  * This include must come after system headers, since it introduces macros that
  * replace system names.
diff --git a/reftable/system.h b/reftable/system.h
index 4907306c0c..18f9207dfe 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -16,17 +16,6 @@  license that can be found in the LICENSE file or at
 #include "hash.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/
 
-#include <zlib.h>
-
-#ifdef NO_UNCOMPRESS2
-/*
- * This is uncompress2, which is only available in zlib >= 1.2.9
- * (released as of early 2017)
- */
-int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source,
-		uLong *sourceLen);
-#endif
-
 int hash_size(uint32_t id);
 
 #endif