[RFC,0/2] add an external testing library for unit tests

Message ID	20230427175007.902278-1-calvinwan@google.com (mailing list archive)
Headers	show Return-Path: <git-owner@vger.kernel.org> Date: Thu, 27 Apr 2023 17:50:05 +0000 Mime-Version: 1.0 Message-ID: <20230427175007.902278-1-calvinwan@google.com> Subject: [RFC PATCH 0/2] add an external testing library for unit tests From: Calvin Wan <calvinwan@google.com> To: git@vger.kernel.org Cc: Calvin Wan <calvinwan@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	add an external testing library for unit tests \| expand [RFC,0/2] add an external testing library for unit tests [RFC,1/2] Add C TAP harness [RFC,2/2] unit test: add basic example

Calvin Wan April 27, 2023, 5:50 p.m. UTC

In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Unit tests additionally provide stability to the
codebase and can simplify debugging through isolation. Turning parts of
Git into libraries[1] gives us the ability to run unit tests on the
libraries and to write unit tests in C. Writing unit tests in pure C,
rather than with our current shell/test-tool helper setup, simplifies
test setup, simplifies passing data around (no shell-isms required), and
reduces testing runtime by not spawning a separate process for every
test invocation.

Unit testing in C requires a separate testing harness that we ideally
would like to be TAP-style and to come with a non-restrictive license.
Fortunately, there already exists a C TAP harness library[2] with an MIT
license (at least for the files included in this series).

This first patch introduces the C TAP harness and includes only the
necessary files. The second patch showcases a basic example of it. As an
RFC, I am wondering what the list thinks about using a second testing
library for unit testing? Are there any problems with this particular
external testing library and does it provide the necessary functionality
we would want for unit testing Git libraries? How should the folders be
structured and how should the new Makefile rules be written? Ultimately,
this will help us determine the setup of our unit tests in future
libification patches.

[1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
[2] https://github.com/rra/c-tap-harness/

Calvin Wan (2):
Add C TAP harness
unit test: add basic example

Makefile | 11 +
t/TESTS | 1 +
t/runtests.c | 1789 ++++++++++++++++++++++++++++++++++++++++++++++++
t/tap/basic.c | 1029 ++++++++++++++++++++++++++++
t/tap/basic.h | 198 ++++++
t/tap/macros.h | 109 +++
t/unit-test.c | 14 +
7 files changed, 3151 insertions(+)
create mode 100644 t/TESTS
create mode 100644 t/runtests.c
create mode 100644 t/tap/basic.c
create mode 100644 t/tap/basic.h
create mode 100644 t/tap/macros.h
create mode 100644 t/unit-test.c

Junio C Hamano April 27, 2023, 6:39 p.m. UTC | #1

Calvin Wan <calvinwan@google.com> writes:

> ... Unit tests additionally provide stability to the
> codebase and can simplify debugging through isolation. Turning parts of
> Git into libraries[1] gives us the ability to run unit tests on the
> libraries and to write unit tests in C. Writing unit tests in pure C,
> rather than with our current shell/test-tool helper setup, simplifies
> test setup, simplifies passing data around (no shell-isms required), and
> reduces testing runtime by not spawning a separate process for every
> test invocation.

Good goal, except that unit tests are not panacea---our ultimate
obligation is to give a stable behaviour to the end users and
end-to-end testing is still needed.  We would benefit from having
tests at both levels.

> Unit testing in C requires a separate testing harness that we ideally
> would like to be TAP-style and to come with a non-restrictive license.
>
> Fortunately, there already exists a C TAP harness library[2] with an MIT
> license (at least for the files included in this series). 

Yup.  Consistency with our existing test framework would make it easier
to adopt for all of us.  Good goal.

Calvin Wan April 27, 2023, 6:46 p.m. UTC | #2

On Thu, Apr 27, 2023 at 11:40 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Calvin Wan <calvinwan@google.com> writes:
>
> > ... Unit tests additionally provide stability to the
> > codebase and can simplify debugging through isolation. Turning parts of
> > Git into libraries[1] gives us the ability to run unit tests on the
> > libraries and to write unit tests in C. Writing unit tests in pure C,
> > rather than with our current shell/test-tool helper setup, simplifies
> > test setup, simplifies passing data around (no shell-isms required), and
> > reduces testing runtime by not spawning a separate process for every
> > test invocation.
>
> Good goal, except that unit tests are not panacea---our ultimate
> obligation is to give a stable behaviour to the end users and
> end-to-end testing is still needed.  We would benefit from having
> tests at both levels.

Ah I seem to have misworded this section a bit. I didn't mean to say
that writing unit tests in C would replace our current shell/test-tool
setup. I meant to make the comparison between writing unit tests in C
vs writing unit tests with our current shell/test-tool setup. And I
agree that we would benefit from having
tests at both levels.

brian m. carlson April 27, 2023, 9:35 p.m. UTC | #3

On 2023-04-27 at 17:50:05, Calvin Wan wrote:
> In our current testing environment, we spend a significant amount of
> effort crafting end-to-end tests for error conditions that could easily
> be captured by unit tests (or we simply forgo some hard-to-setup and
> rare error conditions). Unit tests additionally provide stability to the
> codebase and can simplify debugging through isolation. Turning parts of
> Git into libraries[1] gives us the ability to run unit tests on the
> libraries and to write unit tests in C. Writing unit tests in pure C,
> rather than with our current shell/test-tool helper setup, simplifies
> test setup, simplifies passing data around (no shell-isms required), and
> reduces testing runtime by not spawning a separate process for every
> test invocation.

I think this is a good idea.  Like Junio said downthread, we need to
have integration and end-to-end tests, and I think those will continue
to compose the majority of our tests.  However, having simple and easy
to use unit tests would be really valuable for testing things like our
hash and HMAC implementations, as well as a variety of other library
functions, including our strbuf code.

At work, I recently ported a project from C with no unit tests to Rust
with unit tests (and in both cases, our existing integration and
functional testsuite) and found that writing unit tests let us have
substantially more confidence in the correct functioning of our code.

I think it's great that we're using existing TAP functionality as well.

If you're looking for some proof-of-concept projects to illustrate why
this is useful in v1, might I suggest some of the subsystems above?
Assuming it lands, I plan on sending some tests for the percent-encoding
in the strbuf code and some more aggressive testing of our block SHA-256
and HMAC implementations if nobody gets to it before me.  (Mostly
because this is stuff I wrote or touched and would like to have more
confidence in.)

Felipe Contreras May 2, 2023, 4:18 a.m. UTC | #4

Calvin Wan wrote:
> In our current testing environment, we spend a significant amount of
> effort crafting end-to-end tests for error conditions that could easily
> be captured by unit tests (or we simply forgo some hard-to-setup and
> rare error conditions). Unit tests additionally provide stability to the
> codebase and can simplify debugging through isolation. Turning parts of
> Git into libraries[1] gives us the ability to run unit tests on the
> libraries and to write unit tests in C. Writing unit tests in pure C,
> rather than with our current shell/test-tool helper setup, simplifies
> test setup, simplifies passing data around (no shell-isms required), and
> reduces testing runtime by not spawning a separate process for every
> test invocation.

I agree unit tests would be very helpful, but they don't need to be written in C.

I sent a RFC patch series [1] attempting to write unit tests, but using Ruby, the
testing framework is less than 100 lines of code, and uses Ruby bindings to use
the C functions.

I think writing everything in C adds a ton of unnecessary complexity for no
gain. In fact some things are simply not possible, like dealing with crashes
without forks.

Modern languages exist for a reason: C isn't the best tool in every situation,
and I believe this is one of them.

Cheers.

[1] https://lore.kernel.org/git/20230502041113.103385-1-felipe.contreras@gmail.com/T/#t

Ævar Arnfjörð Bjarmason May 2, 2023, 1:52 p.m. UTC | #5

On Thu, Apr 27 2023, Calvin Wan wrote:

> In our current testing environment, we spend a significant amount of
> effort crafting end-to-end tests for error conditions that could easily
> be captured by unit tests (or we simply forgo some hard-to-setup and
> rare error conditions). Unit tests additionally provide stability to the
> codebase and can simplify debugging through isolation. Turning parts of
> Git into libraries[1] gives us the ability to run unit tests on the
> libraries and to write unit tests in C. Writing unit tests in pure C,
> rather than with our current shell/test-tool helper setup, simplifies
> test setup, simplifies passing data around (no shell-isms required), and
> reduces testing runtime by not spawning a separate process for every
> test invocation.
>
> Unit testing in C requires a separate testing harness that we ideally
> would like to be TAP-style and to come with a non-restrictive license.
> Fortunately, there already exists a C TAP harness library[2] with an MIT
> license (at least for the files included in this series). 
>
> This first patch introduces the C TAP harness and includes only the
> necessary files. The second patch showcases a basic example of it. As an
> RFC, I am wondering what the list thinks about using a second testing
> library for unit testing? Are there any problems with this particular
> external testing library and does it provide the necessary functionality
> we would want for unit testing Git libraries? How should the folders be
> structured and how should the new Makefile rules be written? Ultimately,
> this will help us determine the setup of our unit tests in future
> libification patches.
>
> [1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
> [2] https://github.com/rra/c-tap-harness/ 

I have some out-of-tree patches I've been meaning to submit that massage
some of our TAP output, and I'd really prefer if we don't end up with
two TAP emitters in-tree if we can help it.

We can support such a thing, but nothing about your goals or your
explanation here provides the "why".

Or rather, I'm not really buying the "passing data around" or "recuding
[...] runtime [overhead]". I think you're describing how *some* of our
*.sh to *.c interop goes, but we have test-lib.sh driving C code without
those issues.

We already have pure-C libraries that we add a really shim to unit test,
the most extreme example of this is t0032-reftable-unittest.sh, whose
body is simply (excluding comments):

	#!/bin/sh
	test_description='reftable unittests'

	TEST_PASSES_SANITIZE_LEAK=true
	. ./test-lib.sh

	test_expect_success 'unittests' '
		TMPDIR=$(pwd) && export TMPDIR &&
		test-tool reftable
	'

	test_done

Now, that goes into reftable/test_framework.h which basically implements
its own mini-test framework, so that's at least a *partial* argument for
what you're suggesting here, but note that it doesn't emit TAP, it just
returns an exit code, the EXPECT() etc. is purely internal. I.e. "what
should we return?".

Probably a more git-y example is t0071-sort.sh, whose body is similar
(skipping most of the boilerplate):

	test_expect_success 'DEFINE_LIST_SORT_DEBUG' '
		test-tool mergesort test
	'

We then have similar library tests as e.g. t0063-string-list.sh,
t0061-run-command.sh, t3070-wildmatch.sh etc.

None of those are perfect, but I think the current arrangement is rather
ideal. We can write most or all of the test in C, but we just do so by
calling a function that returns an exit code.

It does mean we need to spawn a shell for test-lib.sh, and call a
test-tool at least once, but the overhead of that is trivial. It's not
trivial in some cases where we call the helper in a loop, but that's
much more easily addressed than hoisting all of TAP generation into the
C space.

Felipe Contreras May 2, 2023, 3:28 p.m. UTC | #6

Ævar Arnfjörð Bjarmason wrote:

> We already have pure-C libraries that we add a really shim to unit test,
> the most extreme example of this is t0032-reftable-unittest.sh, whose
> body is simply (excluding comments):
> 	
> 	#!/bin/sh
> 	test_description='reftable unittests'
> 	
> 	TEST_PASSES_SANITIZE_LEAK=true
> 	. ./test-lib.sh
> 	
> 	test_expect_success 'unittests' '
> 		TMPDIR=$(pwd) && export TMPDIR &&
> 		test-tool reftable
> 	'
> 	
> 	test_done

Yeah, but an output of 'ok 1 - unittests' is not very useful, neither is an
output of 'not ok 1 - unittests'.

This completely misses the point of a TAP interface, which is to parse the
status of individual test cases, or even individual assertions.

If all we are doing is check the exit code of some program, then we don't need
TAP.

> Now, that goes into reftable/test_framework.h which basically implements
> its own mini-test framework, so that's at least a *partial* argument for
> what you're suggesting here, but note that it doesn't emit TAP, it just
> returns an exit code, the EXPECT() etc. is purely internal. I.e. "what
> should we return?".

You are just describing the status quo.

I think this is a naturaistic fallacy: confusing what is versus what ought to
be.

Can we test C code with our current testing framework? Yes.

Should we? That's the actual question.

> None of those are perfect, but I think the current arrangement is rather
> ideal.

I think misuing TAP is far from ideal.

In my view an ideal framework would:

 1. Be able to test C code
 2. Report individual test cases success/failure
 3. Report relevant context in the case of failure (actual vs. expected)
 4. Don't create forks on every individual test case or assertion
 5. Properly handle crashes

Doing basically the following is not an ideal framework:

  echo '1..1'
  test-tool reftable > /dev/null 2>&1 && echo 'ok 1' || { echo 'not ok 1'; false; }

Yes, it works. But "it works" has never been a good enough reason to stay on
the status quo.

> We can write most or all of the test in C, but we just do so by
> calling a function that returns an exit code.

Yes, we *can*, but should we?

---

If we can test C code in a way that individual test case failures are reported
(as is the intention with TAP), why would we reject that in favor of the status
quo which basically is just reporting the exit code of the whole test file?

[RFC,0/2] add an external testing library for unit tests

Message

Comments