mbox series

[v5,0/5] Introduce libgit-rs, a Rust wrapper around libgit.a

Message ID cover.1729032373.git.steadmon@google.com (mailing list archive)
Headers show
Series Introduce libgit-rs, a Rust wrapper around libgit.a | expand

Message

Josh Steadmon Oct. 15, 2024, 10:50 p.m. UTC
A quick note: normally I'd wait longer for more feedback before sending
out a new reroll, but due to some firefighting at $DAYJOB, I will not
have time to focus on this series for the next several weeks, possibly
up to a month. I wanted to get V5 out before then, but please understand
that I will not be able to address new feedback for a while.

This series provides two small Rust wrapper libraries around parts of
Git: "libgit-sys", which exposes a few functions from libgit.a, and
"libgit", which provides a more Rust-friendly interface to some of those
functions. In addition to included unit tests, at $DAYJOB we have tested
building JJ[1] with our library and used it to replace some of the
libgit2-rs uses.

[1] https://github.com/martinvonz/jj

Please find V1 cover letter here:
https://lore.kernel.org/git/cover.1723054623.git.steadmon@google.com/

Known NEEDSWORK:

* Investigate alternative methods of managing symbol visibility &
  renaming.

* Figure out symbol versioning

Changes in V5:
* When building with INCLUDE_LIBGIT_RS defined, add
  "-fvisibility=hidden" to CFLAGS. This allows us to manage symbol
  visibility in libgitpub.a without causing `make all` to rebuild from
  scratch due to changing CFLAGS.

* Avoid using c_int in the higher-level Rust API.

* Remove libgitpub.a and intermediate files with `make clean`.


Changes in V4:
* Drop V3 patch #3, which added wrappers around repository
  initialization and config access. These are not well-libified, and
  they are not necessary for JJ's proof-of-concept use case, so let's
  avoid exporting them for now.

* Set a minimum supported Rust version of 1.63. Autodetect whether our
  Rust version has c_int and c_char types; if not, define them
  ourselves.

* When building libgitpub.a via build.rs, set DEVELOPER=1 to catch
  additional errors at build time.

* In build.rs, use the make_cmd crate to portable select the correct
  invocation of GNU Make.

* Follow naming standards for _alloc() and _free() functions.

* Use String instead of CString in higher-level API.

* Move libgit_configset_alloc() and libgit_configset_free() out of
  upstream Git, to the libgitpub shim library.

* In libgitpub, initialize libgit_config_set structs in the _alloc()
  function rather than with a separate _init() function.

* Remove unnecessary comments in libgit-sys showing where the wrapped
  functions were originally defined.

* Fix clippy lint: don't reborrow configfile path references.

* Various typo fixes and `cargo fmt` fixes.

Changes in V3:
* Renamed cgit-rs to libgit-rs and cgit-sys to libgit-sys

* Makefile cleanup, particularly adding config.mak options that
  developers can set to run Rust builds and tests by default (Patch 6)

* Provide testdata configs for unit tests

* ConfigSet API now uses &Path instead of &str -- more ergonomic for
  Rust users to pass in and errors out if the path string isn't UTF-8

* Fixed unresolved dependency on libz in Cargo.toml


Calvin Wan (2):
  libgit: add higher-level libgit crate
  Makefile: add option to build and test libgit-rs and libgit-rs-sys

Josh Steadmon (3):
  common-main: split init and exit code into new files
  libgit-sys: introduce Rust wrapper for libgit.a
  libgit-sys: also export some config_set functions

 .gitignore                                    |  2 +
 Makefile                                      | 44 +++++++++
 common-exit.c                                 | 26 +++++
 common-init.c                                 | 63 ++++++++++++
 common-init.h                                 |  6 ++
 common-main.c                                 | 83 +---------------
 contrib/libgit-rs/Cargo.lock                  | 77 +++++++++++++++
 contrib/libgit-rs/Cargo.toml                  | 15 +++
 contrib/libgit-rs/build.rs                    |  4 +
 contrib/libgit-rs/libgit-sys/Cargo.lock       | 69 ++++++++++++++
 contrib/libgit-rs/libgit-sys/Cargo.toml       | 17 ++++
 contrib/libgit-rs/libgit-sys/README.md        | 15 +++
 contrib/libgit-rs/libgit-sys/build.rs         | 35 +++++++
 .../libgit-sys/public_symbol_export.c         | 50 ++++++++++
 .../libgit-sys/public_symbol_export.h         | 18 ++++
 contrib/libgit-rs/libgit-sys/src/lib.rs       | 79 +++++++++++++++
 contrib/libgit-rs/src/lib.rs                  | 95 +++++++++++++++++++
 contrib/libgit-rs/testdata/config1            |  2 +
 contrib/libgit-rs/testdata/config2            |  2 +
 contrib/libgit-rs/testdata/config3            |  2 +
 t/Makefile                                    | 16 ++++
 21 files changed, 639 insertions(+), 81 deletions(-)
 create mode 100644 common-exit.c
 create mode 100644 common-init.c
 create mode 100644 common-init.h
 create mode 100644 contrib/libgit-rs/Cargo.lock
 create mode 100644 contrib/libgit-rs/Cargo.toml
 create mode 100644 contrib/libgit-rs/build.rs
 create mode 100644 contrib/libgit-rs/libgit-sys/Cargo.lock
 create mode 100644 contrib/libgit-rs/libgit-sys/Cargo.toml
 create mode 100644 contrib/libgit-rs/libgit-sys/README.md
 create mode 100644 contrib/libgit-rs/libgit-sys/build.rs
 create mode 100644 contrib/libgit-rs/libgit-sys/public_symbol_export.c
 create mode 100644 contrib/libgit-rs/libgit-sys/public_symbol_export.h
 create mode 100644 contrib/libgit-rs/libgit-sys/src/lib.rs
 create mode 100644 contrib/libgit-rs/src/lib.rs
 create mode 100644 contrib/libgit-rs/testdata/config1
 create mode 100644 contrib/libgit-rs/testdata/config2
 create mode 100644 contrib/libgit-rs/testdata/config3

Range-diff against v4:
4:  2ed503216f ! 1:  1ae14207f6 Makefile: add option to build and test libgit-rs and libgit-rs-sys
    @@
      ## Metadata ##
    -Author: Calvin Wan <calvinwan@google.com>
    +Author: Josh Steadmon <steadmon@google.com>
     
      ## Commit message ##
    -    Makefile: add option to build and test libgit-rs and libgit-rs-sys
    +    common-main: split init and exit code into new files
     
    -    Add libgitrs, libgitrs-sys, libgitrs-test, and libgitrs-sys-test targets
    -    to their respective Makefiles so they can be built and tested without
    -    having to run cargo build/test.
    +    Currently, object files in libgit.a reference common_exit(), which is
    +    contained in common-main.o. However, common-main.o also includes main(),
    +    which references cmd_main() in git.o, which in turn depends on all the
    +    builtin/*.o objects.
     
    -    Add environment variable, INCLUDE_LIBGIT_RS, that when set,
    -    automatically builds and tests libgit-rs and libgit-rs-sys when `make
    -    all` is ran.
    +    We would like to allow external users to link libgit.a without needing
    +    to include so many extra objects. Enable this by splitting common_exit()
    +    and check_bug_if_BUG() into a new file common-exit.c, and add
    +    common-exit.o to LIB_OBJS so that these are included in libgit.a.
     
    -    Signed-off-by: Calvin Wan <calvinwan@google.com>
    +    This split has previously been proposed ([1], [2]) to support fuzz tests
    +    and unit tests by avoiding conflicting definitions for main(). However,
    +    both of those issues were resolved by other methods of avoiding symbol
    +    conflicts. Now we are trying to make libgit.a more self-contained, so
    +    hopefully we can revisit this approach.
    +
    +    Additionally, move the initialization code out of main() into a new
    +    init_git() function in its own file. Include this in libgit.a as well,
    +    so that external users can share our setup code without calling our
    +    main().
    +
    +    [1] https://lore.kernel.org/git/Yp+wjCPhqieTku3X@google.com/
    +    [2] https://lore.kernel.org/git/20230517-unit-tests-v2-v2-1-21b5b60f4b32@google.com/
    +
    +    Signed-off-by: Josh Steadmon <steadmon@google.com>
     
      ## Makefile ##
    -@@ Makefile: build-unit-tests: $(UNIT_TEST_PROGS)
    - unit-tests: $(UNIT_TEST_PROGS) t/helper/test-tool$X
    - 	$(MAKE) -C t/ unit-tests
    +@@ Makefile: LIB_OBJS += combine-diff.o
    + LIB_OBJS += commit-graph.o
    + LIB_OBJS += commit-reach.o
    + LIB_OBJS += commit.o
    ++LIB_OBJS += common-exit.o
    ++LIB_OBJS += common-init.o
    + LIB_OBJS += compat/nonblock.o
    + LIB_OBJS += compat/obstack.o
    + LIB_OBJS += compat/terminal.o
    +
    + ## common-exit.c (new) ##
    +@@
    ++#include "git-compat-util.h"
    ++#include "trace2.h"
    ++
    ++static void check_bug_if_BUG(void)
    ++{
    ++	if (!bug_called_must_BUG)
    ++		return;
    ++	BUG("on exit(): had bug() call(s) in this process without explicit BUG_if_bug()");
    ++}
    ++
    ++/* We wrap exit() to call common_exit() in git-compat-util.h */
    ++int common_exit(const char *file, int line, int code)
    ++{
    ++	/*
    ++	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
    ++	 * to e.g. turn -1 into 255. On a POSIX system this is
    ++	 * redundant, see exit(3) and wait(2), but as it doesn't harm
    ++	 * anything there we don't need to guard this with an "ifdef".
    ++	 */
    ++	code &= 0xff;
    ++
    ++	check_bug_if_BUG();
    ++	trace2_cmd_exit_fl(file, line, code);
    ++
    ++	return code;
    ++}
    +
    + ## common-init.c (new) ##
    +@@
    ++#define USE_THE_REPOSITORY_VARIABLE
    ++
    ++#include "git-compat-util.h"
    ++#include "common-init.h"
    ++#include "exec-cmd.h"
    ++#include "gettext.h"
    ++#include "attr.h"
    ++#include "repository.h"
    ++#include "setup.h"
    ++#include "strbuf.h"
    ++#include "trace2.h"
    ++
    ++/*
    ++ * Many parts of Git have subprograms communicate via pipe, expect the
    ++ * upstream of a pipe to die with SIGPIPE when the downstream of a
    ++ * pipe does not need to read all that is written.  Some third-party
    ++ * programs that ignore or block SIGPIPE for their own reason forget
    ++ * to restore SIGPIPE handling to the default before spawning Git and
    ++ * break this carefully orchestrated machinery.
    ++ *
    ++ * Restore the way SIGPIPE is handled to default, which is what we
    ++ * expect.
    ++ */
    ++static void restore_sigpipe_to_default(void)
    ++{
    ++	sigset_t unblock;
    ++
    ++	sigemptyset(&unblock);
    ++	sigaddset(&unblock, SIGPIPE);
    ++	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
    ++	signal(SIGPIPE, SIG_DFL);
    ++}
    ++
    ++void init_git(const char **argv)
    ++{
    ++	struct strbuf tmp = STRBUF_INIT;
    ++
    ++	trace2_initialize_clock();
    ++
    ++	/*
    ++	 * Always open file descriptors 0/1/2 to avoid clobbering files
    ++	 * in die().  It also avoids messing up when the pipes are dup'ed
    ++	 * onto stdin/stdout/stderr in the child processes we spawn.
    ++	 */
    ++	sanitize_stdfds();
    ++	restore_sigpipe_to_default();
    ++
    ++	git_resolve_executable_dir(argv[0]);
    ++
    ++	setlocale(LC_CTYPE, "");
    ++	git_setup_gettext();
    ++
    ++	initialize_repository(the_repository);
    ++
    ++	attr_start();
    ++
    ++	trace2_initialize();
    ++	trace2_cmd_start(argv);
    ++	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);
    ++
    ++	if (!strbuf_getcwd(&tmp))
    ++		tmp_original_cwd = strbuf_detach(&tmp, NULL);
    ++}
    +
    + ## common-init.h (new) ##
    +@@
    ++#ifndef COMMON_INIT_H
    ++#define COMMON_INIT_H
    ++
    ++void init_git(const char **argv);
    ++
    ++#endif /* COMMON_INIT_H */
    +
    + ## common-main.c ##
    +@@
    +-#define USE_THE_REPOSITORY_VARIABLE
    +-
    + #include "git-compat-util.h"
    +-#include "exec-cmd.h"
    +-#include "gettext.h"
    +-#include "attr.h"
    +-#include "repository.h"
    +-#include "setup.h"
    +-#include "strbuf.h"
    +-#include "trace2.h"
    +-
    +-/*
    +- * Many parts of Git have subprograms communicate via pipe, expect the
    +- * upstream of a pipe to die with SIGPIPE when the downstream of a
    +- * pipe does not need to read all that is written.  Some third-party
    +- * programs that ignore or block SIGPIPE for their own reason forget
    +- * to restore SIGPIPE handling to the default before spawning Git and
    +- * break this carefully orchestrated machinery.
    +- *
    +- * Restore the way SIGPIPE is handled to default, which is what we
    +- * expect.
    +- */
    +-static void restore_sigpipe_to_default(void)
    +-{
    +-	sigset_t unblock;
    +-
    +-	sigemptyset(&unblock);
    +-	sigaddset(&unblock, SIGPIPE);
    +-	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
    +-	signal(SIGPIPE, SIG_DFL);
    +-}
    ++#include "common-init.h"
      
    -+.PHONY: libgitrs-sys
    -+libgitrs-sys:
    -+	$(QUIET)(\
    -+		cd contrib/libgit-rs/libgit-sys && \
    -+		cargo build \
    -+	)
    -+.PHONY: libgitrs
    -+libgitrs:
    -+	$(QUIET)(\
    -+		cd contrib/libgit-rs && \
    -+		cargo build \
    -+	)
    -+ifdef INCLUDE_LIBGIT_RS
    -+all:: libgitrs
    -+endif
    -+
    - contrib/libgit-rs/libgit-sys/partial_symbol_export.o: contrib/libgit-rs/libgit-sys/public_symbol_export.o libgit.a reftable/libreftable.a xdiff/lib.a
    - 	$(LD) -r $^ -o $@
    + int main(int argc, const char **argv)
    + {
    + 	int result;
    +-	struct strbuf tmp = STRBUF_INIT;
    +-
    +-	trace2_initialize_clock();
    +-
    +-	/*
    +-	 * Always open file descriptors 0/1/2 to avoid clobbering files
    +-	 * in die().  It also avoids messing up when the pipes are dup'ed
    +-	 * onto stdin/stdout/stderr in the child processes we spawn.
    +-	 */
    +-	sanitize_stdfds();
    +-	restore_sigpipe_to_default();
    +-
    +-	git_resolve_executable_dir(argv[0]);
    +-
    +-	setlocale(LC_CTYPE, "");
    +-	git_setup_gettext();
    +-
    +-	initialize_repository(the_repository);
    +-
    +-	attr_start();
    +-
    +-	trace2_initialize();
    +-	trace2_cmd_start(argv);
    +-	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);
    +-
    +-	if (!strbuf_getcwd(&tmp))
    +-		tmp_original_cwd = strbuf_detach(&tmp, NULL);
      
    -
    - ## t/Makefile ##
    -@@ t/Makefile: perf:
    ++	init_git(argv);
    + 	result = cmd_main(argc, argv);
      
    - .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
    - 	check-chainlint clean-chainlint test-chainlint $(UNIT_TESTS)
    -+
    -+.PHONY: libgitrs-sys-test
    -+libgitrs-sys-test:
    -+	$(QUIET)(\
    -+		cd ../contrib/libgit-rs/libgit-sys && \
    -+		cargo test \
    -+	)
    -+.PHONY: libgitrs-test
    -+libgitrs-test:
    -+	$(QUIET)(\
    -+		cd ../contrib/libgit-rs && \
    -+		cargo test \
    -+	)
    -+ifdef INCLUDE_LIBGIT_RS
    -+all:: libgitrs-sys-test libgitrs-test
    -+endif
    + 	/* Not exit(3), but a wrapper calling our common_exit() */
    + 	exit(result);
    + }
    +-
    +-static void check_bug_if_BUG(void)
    +-{
    +-	if (!bug_called_must_BUG)
    +-		return;
    +-	BUG("on exit(): had bug() call(s) in this process without explicit BUG_if_bug()");
    +-}
    +-
    +-/* We wrap exit() to call common_exit() in git-compat-util.h */
    +-int common_exit(const char *file, int line, int code)
    +-{
    +-	/*
    +-	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
    +-	 * to e.g. turn -1 into 255. On a POSIX system this is
    +-	 * redundant, see exit(3) and wait(2), but as it doesn't harm
    +-	 * anything there we don't need to guard this with an "ifdef".
    +-	 */
    +-	code &= 0xff;
    +-
    +-	check_bug_if_BUG();
    +-	trace2_cmd_exit_fl(file, line, code);
    +-
    +-	return code;
    +-}
1:  ed925d6a33 ! 2:  1ed010c378 libgit-sys: introduce Rust wrapper for libgit.a
    @@ Makefile: clean: profile-clean coverage-clean cocciclean
      	$(MAKE) -C Documentation/ clean
      	$(RM) Documentation/GIT-EXCLUDED-PROGRAMS
     +	$(RM) -r contrib/libgit-rs/libgit-sys/target
    ++	$(RM) -r contrib/libgit-rs/libgit-sys/partial_symbol_export.o
    ++	$(RM) -r contrib/libgit-rs/libgit-sys/hidden_symbol_export.o
    ++	$(RM) -r contrib/libgit-rs/libgit-sys/libgitpub.a
      ifndef NO_PERL
      	$(RM) -r perl/build/
      endif
2:  8eeab7b418 = 3:  00762b57d0 libgit-sys: also export some config_set functions
3:  29599e9c7b ! 4:  4e5360931b libgit: add higher-level libgit crate
    @@ Metadata
      ## Commit message ##
         libgit: add higher-level libgit crate
     
    -    Wrap `struct config_set` and a few of its associated functions in
    -    libgit-sys. Also introduce a higher-level "libgit" crate which provides
    -    a more Rust-friendly interface to config_set structs.
    +    The C functions exported by libgit-sys do not provide an idiomatic Rust
    +    interface. To make it easier to use these functions via Rust, add a
    +    higher-level "libgit" crate, that wraps the lower-level configset API
    +    with an interface that is more Rust-y.
    +
    +    This combination of $X and $X-sys crates is a common pattern for FFI in
    +    Rust, as documented in "The Cargo Book" [1].
    +
    +    [1] https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages
     
         Co-authored-by: Josh Steadmon <steadmon@google.com>
         Signed-off-by: Josh Steadmon <steadmon@google.com>
    @@ Makefile: clean: profile-clean coverage-clean cocciclean
      	$(RM) Documentation/GIT-EXCLUDED-PROGRAMS
     -	$(RM) -r contrib/libgit-rs/libgit-sys/target
     +	$(RM) -r contrib/libgit-rs/target contrib/libgit-rs/libgit-sys/target
    - ifndef NO_PERL
    - 	$(RM) -r perl/build/
    - endif
    + 	$(RM) -r contrib/libgit-rs/libgit-sys/partial_symbol_export.o
    + 	$(RM) -r contrib/libgit-rs/libgit-sys/hidden_symbol_export.o
    + 	$(RM) -r contrib/libgit-rs/libgit-sys/libgitpub.a
     
      ## contrib/libgit-rs/Cargo.lock (new) ##
     @@
    @@ contrib/libgit-rs/src/lib.rs (new)
     +        }
     +    }
     +
    -+    pub fn get_int(&mut self, key: &str) -> Option<c_int> {
    ++    pub fn get_int(&mut self, key: &str) -> Option<i32> {
     +        let key = CString::new(key).expect("Couldn't convert to CString");
     +        let mut val: c_int = 0;
     +        unsafe {
    @@ contrib/libgit-rs/src/lib.rs (new)
     +            }
     +        }
     +
    -+        Some(val)
    ++        Some(val.into())
     +    }
     +
     +    pub fn get_string(&mut self, key: &str) -> Option<String> {
-:  ---------- > 5:  15ce989de8 Makefile: add option to build and test libgit-rs and libgit-rs-sys

base-commit: 557ae147e6cdc9db121269b058c757ac5092f9c9