[v2,0/2] t/helper/test-tool: implement 'sha1-unsafe' helper

Message ID	cover.1730833506.git.me@ttaylorr.com (mailing list archive)
Headers	show Received: from mail-yb1-f169.google.com (mail-yb1-f169.google.com [209.85.219.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE1211E2007 for <git@vger.kernel.org>; Tue, 5 Nov 2024 19:05:13 +0000 (UTC) Date: Tue, 5 Nov 2024 14:05:10 -0500 From: Taylor Blau <me@ttaylorr.com> To: git@vger.kernel.org Cc: Elijah Newren <newren@gmail.com>, Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>, "brian m. carlson" <sandals@crustytoothpaste.net> Subject: [PATCH v2 0/2] t/helper/test-tool: implement 'sha1-unsafe' helper Message-ID: <cover.1730833506.git.me@ttaylorr.com> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline
Series	t/helper/test-tool: implement 'sha1-unsafe' helper \| expand [v2,0/2] t/helper/test-tool: implement 'sha1-unsafe' helper [v2,1/2] t/helper/test-sha1: prepare for an unsafe mode [v2,2/2] t/helper/test-tool: implement sha1-unsafe helper

Taylor Blau Nov. 5, 2024, 7:05 p.m. UTC

This series implements a new 'sha1-unsafe' test helper, similar to
't/helper/test-tool sha1'.

I have found such a helper to be really handy when debugging the new
SHA1_UNSAFE build knobs, e.g., to easily compare the performance of the
safe versus unsafe routines, different unsafe variants, etc.

The first patch prepares us by setting up the existing cmd_hash_impl()
function to be able to conditionally use the unsafe variant. The final
patch introduces a new 'sha1-unsafe' test helper which calls the new
variant.

Thanks in advance for your review!

Taylor Blau (2):
  t/helper/test-sha1: prepare for an unsafe mode
  t/helper/test-tool: implement sha1-unsafe helper

 t/helper/test-hash.c   | 17 +++++++++++++----
 t/helper/test-sha1.c   |  7 ++++++-
 t/helper/test-sha1.sh  | 38 ++++++++++++++++++++++----------------
 t/helper/test-sha256.c |  2 +-
 t/helper/test-tool.c   |  1 +
 t/helper/test-tool.h   |  3 ++-
 6 files changed, 45 insertions(+), 23 deletions(-)

Range-diff against v1:
1:  3b31db4d2df = 1:  0e2fcee6894 t/helper/test-sha1: prepare for an unsafe mode
2:  d343f5dc9e5 = 2:  d8c1fc78b57 t/helper/test-tool: implement sha1-unsafe helper

base-commit: 8f8d6eee531b3fa1a8ef14f169b0cb5035f7a772

Jeff King Nov. 7, 2024, 1:47 a.m. UTC | #1

On Tue, Nov 05, 2024 at 02:05:10PM -0500, Taylor Blau wrote:

> This series implements a new 'sha1-unsafe' test helper, similar to
> 't/helper/test-tool sha1'.
> 
> I have found such a helper to be really handy when debugging the new
> SHA1_UNSAFE build knobs, e.g., to easily compare the performance of the
> safe versus unsafe routines, different unsafe variants, etc.
> 
> The first patch prepares us by setting up the existing cmd_hash_impl()
> function to be able to conditionally use the unsafe variant. The final
> patch introduces a new 'sha1-unsafe' test helper which calls the new
> variant.

I think this is a useful thing to have, and I didn't see anything wrong
in the implementation. I did notice some oddities that existed before
your series:

  1. Why do we have "test-tool sha256" at all? Nobody in the test suite
     calls it. It feels like the whole test-sha1/sha256/hash split is
     overly complicated. A single "test-tool hash" seems like it would
     be simpler, and it could take an "--algo" parameter (and an
     "--unsafe" one). I guess in the end we end up with the same options
     ,but the proliferation of top-level test-tool commands seems ugly
     to me (likewise "sha1_is_sha1dc").

  2. You modified test-sha1.sh, but I've wondered if we should just
     delete that script. It is not ever invoked in the test suite AFAIK.
     If we want correctness tests, they should go into a real t[0-9]*.sh
     script (though one imagines we exercise the sha1 code quite a lot
     in the rest of the tests). And it starts with a weird ad-hoc
     performance test that doesn't really tell us much. A t/perf test
     would be better (if it is even worth measuring at all).

So I dunno. None of those are the fault of your series, but it is piling
on yet another test-tool command.

-Peff

brian m. carlson Nov. 7, 2024, 2:05 a.m. UTC | #2

On 2024-11-07 at 01:47:37, Jeff King wrote:
> I think this is a useful thing to have, and I didn't see anything wrong
> in the implementation. I did notice some oddities that existed before
> your series:
> 
>   1. Why do we have "test-tool sha256" at all? Nobody in the test suite
>      calls it. It feels like the whole test-sha1/sha256/hash split is
>      overly complicated. A single "test-tool hash" seems like it would
>      be simpler, and it could take an "--algo" parameter (and an
>      "--unsafe" one). I guess in the end we end up with the same options
>      ,but the proliferation of top-level test-tool commands seems ugly
>      to me (likewise "sha1_is_sha1dc").

I think we do in `pack_trailer` in `t/lib-pack.sh`, but not in a
greppable way:

  # Compute and append pack trailer to "$1"
  pack_trailer () {
  	test-tool $(test_oid algo) -b <"$1" >trailer.tmp &&
  	cat trailer.tmp >>"$1" &&
  	rm -f trailer.tmp
  }

When you posed the question above, I wasn't sure why I added this
functionality: was it just to test my SHA-256 implementation adequately
or did it have some actual utility in the testsuite?  However, I knew if
it didn't appear straightforwardly in `git grep`, any uses would involve
`test_oid`, and boom, I was right.

I think a single helper with `--algo` and `--unsafe` parameters would
also be fine and would probably be a little more tidy, as you mention.

Taylor Blau Nov. 7, 2024, 9:33 p.m. UTC | #3

On Thu, Nov 07, 2024 at 02:05:26AM +0000, brian m. carlson wrote:
> On 2024-11-07 at 01:47:37, Jeff King wrote:
> > I think this is a useful thing to have, and I didn't see anything wrong
> > in the implementation. I did notice some oddities that existed before
> > your series:
> >
> >   1. Why do we have "test-tool sha256" at all? Nobody in the test suite
> >      calls it. It feels like the whole test-sha1/sha256/hash split is
> >      overly complicated. A single "test-tool hash" seems like it would
> >      be simpler, and it could take an "--algo" parameter (and an
> >      "--unsafe" one). I guess in the end we end up with the same options
> >      ,but the proliferation of top-level test-tool commands seems ugly
> >      to me (likewise "sha1_is_sha1dc").
>
> I think we do in `pack_trailer` in `t/lib-pack.sh`, but not in a
> greppable way:
>
>   # Compute and append pack trailer to "$1"
>   pack_trailer () {
>   	test-tool $(test_oid algo) -b <"$1" >trailer.tmp &&
>   	cat trailer.tmp >>"$1" &&
>   	rm -f trailer.tmp
>   }

Nice find. I think that it may be worth writing this as:

    case "$(test_oid algo)" in
    sha1)
        test-tool sha1 -b <"$1" >trailer.tmp
        ;;
    sha256)
        test-tool sha256 -b <"$1" >trailer.tmp
        ;;
    *)
        echo >&2 "unknown algorithm: $(test_oid algo)"
        exit 1
        ;;
    esac

To make it more greppable. Obviously the existing implementation is not
wrong, but I do find it remarkably hard to search for ;-).

> I think a single helper with `--algo` and `--unsafe` parameters would
> also be fine and would probably be a little more tidy, as you mention.

That would be nice too.

Thanks,
Taylor

Taylor Blau Nov. 7, 2024, 9:36 p.m. UTC | #4

On Wed, Nov 06, 2024 at 08:47:37PM -0500, Jeff King wrote:
>   2. You modified test-sha1.sh, but I've wondered if we should just
>      delete that script. It is not ever invoked in the test suite AFAIK.
>      If we want correctness tests, they should go into a real t[0-9]*.sh
>      script (though one imagines we exercise the sha1 code quite a lot
>      in the rest of the tests). And it starts with a weird ad-hoc
>      performance test that doesn't really tell us much. A t/perf test
>      would be better (if it is even worth measuring at all).

Yeah, I was sort of puzzled and thinking the same thing as I wrote the
patch.

I wouldn't be opposed to deleting it in the future, and certainly have
no strong feelings about keeping it around in the meantime. It just
seemed like the path of least resistance to bring it along for the
sha1-unsafe ride instead of deleting it outright.

> So I dunno. None of those are the fault of your series, but it is piling
> on yet another test-tool command.

Yeah, I think there was a fair amount of interesting discussion about
possible alternatives in this thread, which I am appreciative of.

I think that if nobody has strong feelings or issues with the current
implementation to add the sha1-unsafe test-tool, that we should do so
and take these patches.

In the future we can consider other things on top, like dropping the
test-sha1.sh script, having an unsafe pointer embedded within
the_hash_algo, or something else entirely. But those can be done on top,
or not at all, and I don't want to hold up this series for them.

Thanks,
Taylor

Jeff King Nov. 8, 2024, 5:22 p.m. UTC | #5

On Thu, Nov 07, 2024 at 02:05:26AM +0000, brian m. carlson wrote:

> >   1. Why do we have "test-tool sha256" at all? Nobody in the test suite
> >      calls it. It feels like the whole test-sha1/sha256/hash split is
> >      overly complicated. A single "test-tool hash" seems like it would
> >      be simpler, and it could take an "--algo" parameter (and an
> >      "--unsafe" one). I guess in the end we end up with the same options
> >      ,but the proliferation of top-level test-tool commands seems ugly
> >      to me (likewise "sha1_is_sha1dc").
> 
> I think we do in `pack_trailer` in `t/lib-pack.sh`, but not in a
> greppable way:
> 
>   # Compute and append pack trailer to "$1"
>   pack_trailer () {
>   	test-tool $(test_oid algo) -b <"$1" >trailer.tmp &&
>   	cat trailer.tmp >>"$1" &&
>   	rm -f trailer.tmp
>   }
> 
> When you posed the question above, I wasn't sure why I added this
> functionality: was it just to test my SHA-256 implementation adequately
> or did it have some actual utility in the testsuite?  However, I knew if
> it didn't appear straightforwardly in `git grep`, any uses would involve
> `test_oid`, and boom, I was right.

Ah, good catch. Or good recall, I suppose. ;)

It feels a tiny bit hacky that we have to specify the algo here. If
there were a plumbing tool to work with the trailing hash of a
csum-file, then naturally it would use the repo's algorithm instead of
having to specify it. And I have run into situations once or twice where
that might have been a useful debugging tool, rather than hacking
together shell tools like dd. But since we have already done that
hacking together in the test suite, and this is the only spot that has
needed it so far, it's probably worth letting sleeping dogs lie.

> I think a single helper with `--algo` and `--unsafe` parameters would
> also be fine and would probably be a little more tidy, as you mention.

Yup.

-Peff

Jeff King Nov. 8, 2024, 5:23 p.m. UTC | #6

On Thu, Nov 07, 2024 at 04:33:56PM -0500, Taylor Blau wrote:

> >   # Compute and append pack trailer to "$1"
> >   pack_trailer () {
> >   	test-tool $(test_oid algo) -b <"$1" >trailer.tmp &&
> >   	cat trailer.tmp >>"$1" &&
> >   	rm -f trailer.tmp
> >   }
> 
> Nice find. I think that it may be worth writing this as:
> 
>     case "$(test_oid algo)" in
>     sha1)
>         test-tool sha1 -b <"$1" >trailer.tmp
>         ;;
>     sha256)
>         test-tool sha256 -b <"$1" >trailer.tmp
>         ;;
>     *)
>         echo >&2 "unknown algorithm: $(test_oid algo)"
>         exit 1
>         ;;
>     esac

If it were "test-tool hash --algo=$(test_oid algo)", then presumably
test-tool would naturally do the same switch and error message
internally.

-Peff

Jeff King Nov. 8, 2024, 5:23 p.m. UTC | #7

On Thu, Nov 07, 2024 at 04:36:04PM -0500, Taylor Blau wrote:

> > So I dunno. None of those are the fault of your series, but it is piling
> > on yet another test-tool command.
> 
> Yeah, I think there was a fair amount of interesting discussion about
> possible alternatives in this thread, which I am appreciative of.
> 
> I think that if nobody has strong feelings or issues with the current
> implementation to add the sha1-unsafe test-tool, that we should do so
> and take these patches.

I'm OK with that direction.

> In the future we can consider other things on top, like dropping the
> test-sha1.sh script, having an unsafe pointer embedded within
> the_hash_algo, or something else entirely. But those can be done on top,
> or not at all, and I don't want to hold up this series for them.

Sounds good.

-Peff

[v2,0/2] t/helper/test-tool: implement 'sha1-unsafe' helper

Message

Comments