Message ID | 20200702140845.24945-3-chriscool@tuxfamily.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add support for %(contents:size) in ref-filter | expand |
Christian Couder <christian.couder@gmail.com> writes: > It's useful and efficient to be able to get the size of the > contents directly without having to pipe through `wc -c`. > > Also the result of the following: > > `git for-each-ref --format='%(contents)' | wc -c` > > is off by one as `git for-each-ref` appends a newline character > after the contents, which can be seen by comparing its ouput > with the output from `git cat-file`. So that's off by number of refs that are shown? > +contents:size:: > + The size in bytes of the complete message. > + Complete as opposed to what? What happens when the object referred to by the ref is not a commit or a tag? I am fine if it just is silently ignored (which is consistent with already existing behaviour of other requests that do not make sense for the given type) if the thing is a blob or a tree, but we'd need to cover the case with a test or two. It seems you only expect this with a tag object and do not have any test that checks for other types of objects? Thanks. > +# We cannot use test_atom to check contents:size of signed tags due to sanitize_pgp > +test_tag_contents_size_pgp () { > + ref="$1" > + test_expect_success $PREREQ "basic atom: $ref contents:size" " > + git cat-file tag $ref | tail -n +6 | wc -c >expected && > + git for-each-ref --format='%(contents:size)' $ref >actual && > + test_cmp expected actual > + " > +} > + > PREREQ=GPG > test_atom refs/tags/signed-empty subject '' > test_atom refs/tags/signed-empty contents:subject '' > @@ -629,6 +643,7 @@ test_atom refs/tags/signed-empty body "$sig" > test_atom refs/tags/signed-empty contents:body '' > test_atom refs/tags/signed-empty contents:signature "$sig" > test_atom refs/tags/signed-empty contents "$sig" > +test_tag_contents_size_pgp refs/tags/signed-empty > > test_atom refs/tags/signed-short subject 'subject line' > test_atom refs/tags/signed-short contents:subject 'subject line' > @@ -637,6 +652,7 @@ test_atom refs/tags/signed-short contents:body '' > test_atom refs/tags/signed-short contents:signature "$sig" > test_atom refs/tags/signed-short contents "subject line > $sig" > +test_tag_contents_size_pgp refs/tags/signed-short > > test_atom refs/tags/signed-long subject 'subject line' > test_atom refs/tags/signed-long contents:subject 'subject line' > @@ -649,6 +665,7 @@ test_atom refs/tags/signed-long contents "subject line > > body contents > $sig" > +test_tag_contents_size_pgp refs/tags/signed-long > > test_expect_success 'set up multiple-sort tags' ' > for when in 100000 200000
On Mon, Jul 6, 2020 at 11:44 PM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > It's useful and efficient to be able to get the size of the > > contents directly without having to pipe through `wc -c`. > > > > Also the result of the following: > > > > `git for-each-ref --format='%(contents)' | wc -c` > > > > is off by one as `git for-each-ref` appends a newline character > > after the contents, which can be seen by comparing its ouput > > with the output from `git cat-file`. > > So that's off by number of refs that are shown? Yeah, true. I should have added a ref as I mean something like the following is off by one: `git for-each-ref --format='%(contents)' refs/heads/my-branch | wc -c` I have changed it to the above in my current version. > > +contents:size:: > > + The size in bytes of the complete message. > > + > > Complete as opposed to what? In the existing documentation there is already "The complete message of a commit or tag object is `contents`. " So yeah I could add another preparatory patch to change the above to something like: "The complete message (subject, body, trailers and signature) of a commit or tag object is `contents`. " > What happens when the object referred to by the ref is not a commit > or a tag? Right now %(contents) shows nothing (a blank line actually) when the ref points to something other than a commit or a tag, and %(contents:size) does the same: ``` $ git update-ref refs/mytrees/first HEAD^{tree} $ git for-each-ref --format='%(contents)' refs/mytrees/first $ git for-each-ref --format='%(contents:size)' refs/mytrees/first ``` I am not sure if it's worth updating the existing documentation or just the commit message of this patch. For now I have done the latter in my current version. > I am fine if it just is silently ignored (which is consistent with > already existing behaviour of other requests that do not make sense > for the given type) if the thing is a blob or a tree, but we'd need > to cover the case with a test or two. It seems you only expect this > with a tag object and do not have any test that checks for other > types of objects? My patch already adds 1 test with a commit: --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -125,6 +125,7 @@ test_atom head contents:body '' test_atom head contents:signature '' test_atom head contents 'Initial ' +test_atom head contents:size '8' test_atom head HEAD '*' There is only one test with a commit, because that's already the case for %(contents) too. I am ok with adding another preparatory patch to the series that would add a few more test cases with commits, trees and blobs though.
Christian Couder <christian.couder@gmail.com> writes: >> I am fine if it just is silently ignored (which is consistent with >> already existing behaviour of other requests that do not make sense >> for the given type) if the thing is a blob or a tree, but we'd need >> to cover the case with a test or two. It seems you only expect this >> with a tag object and do not have any test that checks for other >> types of objects? > > My patch already adds 1 test with a commit: But that is not an interesting case, no? Unless I missed that there were new tests to see what happens with a blob and a tree, that is. > --- a/t/t6300-for-each-ref.sh > +++ b/t/t6300-for-each-ref.sh > @@ -125,6 +125,7 @@ test_atom head contents:body '' > test_atom head contents:signature '' > test_atom head contents 'Initial > ' > +test_atom head contents:size '8' > test_atom head HEAD '*' > > There is only one test with a commit, because that's already the case > for %(contents) too. > > I am ok with adding another preparatory patch to the series that would > add a few more test cases with commits, trees and blobs though. I am OK either way, because I know fairly well how these formatting atom system works (I had heavy involvement in the initial one) and I know it would be hard to make it do nonsensical things when given an object of an unexpected type. But I was hoping some among us experienced contributors would lead beginners by example of making sure we test both positive ("see, my shiny new toy does work") and negative ("and it won't do nonsensical things when given an input it is not designed to work with") cases. Thanks.
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index 2db9779d54..833641eacd 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -235,6 +235,9 @@ and `date` to extract the named component. The complete message of a commit or tag object is `contents`. This field can also be used in the following ways: +contents:size:: + The size in bytes of the complete message. + contents:subject:: The "subject" of the commit or tag message. It's actually the concatenation of all lines of the commit message up to the diff --git a/ref-filter.c b/ref-filter.c index bf7b70299b..036a95d0d2 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -127,7 +127,8 @@ static struct used_atom { unsigned int nobracket : 1, push : 1, push_remote : 1; } remote_ref; struct { - enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_TRAILERS } option; + enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, + C_LINES, C_SIG, C_SUB, C_TRAILERS } option; struct process_trailer_options trailer_opts; unsigned int nlines; } contents; @@ -338,6 +339,8 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato atom->u.contents.option = C_BARE; else if (!strcmp(arg, "body")) atom->u.contents.option = C_BODY; + else if (!strcmp(arg, "size")) + atom->u.contents.option = C_LENGTH; else if (!strcmp(arg, "signature")) atom->u.contents.option = C_SIG; else if (!strcmp(arg, "subject")) @@ -1253,6 +1256,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf) v->s = copy_subject(subpos, sublen); else if (atom->u.contents.option == C_BODY_DEP) v->s = xmemdupz(bodypos, bodylen); + else if (atom->u.contents.option == C_LENGTH) + v->s = xstrfmt("%ld", strlen(subpos)); else if (atom->u.contents.option == C_BODY) v->s = xmemdupz(bodypos, nonsiglen); else if (atom->u.contents.option == C_SIG) diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index da59fadc5d..ceee8d9372 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -125,6 +125,7 @@ test_atom head contents:body '' test_atom head contents:signature '' test_atom head contents 'Initial ' +test_atom head contents:size '8' test_atom head HEAD '*' test_atom tag refname refs/tags/testtag @@ -170,6 +171,7 @@ test_atom tag contents:body '' test_atom tag contents:signature '' test_atom tag contents 'Tagging at 1151968727 ' +test_atom tag contents:size '22' test_atom tag HEAD ' ' test_expect_success 'Check invalid atoms names are errors' ' @@ -580,6 +582,7 @@ test_atom refs/tags/subject-body contents 'the subject line first body line second body line ' +test_atom refs/tags/subject-body contents:size '51' test_expect_success 'create tag with multiline subject' ' cat >msg <<-\EOF && @@ -606,6 +609,7 @@ second subject line first body line second body line ' +test_atom refs/tags/multiline contents:size '73' test_expect_success GPG 'create signed tags' ' git tag -s -m "" signed-empty && @@ -622,6 +626,16 @@ sig='-----BEGIN PGP SIGNATURE----- -----END PGP SIGNATURE----- ' +# We cannot use test_atom to check contents:size of signed tags due to sanitize_pgp +test_tag_contents_size_pgp () { + ref="$1" + test_expect_success $PREREQ "basic atom: $ref contents:size" " + git cat-file tag $ref | tail -n +6 | wc -c >expected && + git for-each-ref --format='%(contents:size)' $ref >actual && + test_cmp expected actual + " +} + PREREQ=GPG test_atom refs/tags/signed-empty subject '' test_atom refs/tags/signed-empty contents:subject '' @@ -629,6 +643,7 @@ test_atom refs/tags/signed-empty body "$sig" test_atom refs/tags/signed-empty contents:body '' test_atom refs/tags/signed-empty contents:signature "$sig" test_atom refs/tags/signed-empty contents "$sig" +test_tag_contents_size_pgp refs/tags/signed-empty test_atom refs/tags/signed-short subject 'subject line' test_atom refs/tags/signed-short contents:subject 'subject line' @@ -637,6 +652,7 @@ test_atom refs/tags/signed-short contents:body '' test_atom refs/tags/signed-short contents:signature "$sig" test_atom refs/tags/signed-short contents "subject line $sig" +test_tag_contents_size_pgp refs/tags/signed-short test_atom refs/tags/signed-long subject 'subject line' test_atom refs/tags/signed-long contents:subject 'subject line' @@ -649,6 +665,7 @@ test_atom refs/tags/signed-long contents "subject line body contents $sig" +test_tag_contents_size_pgp refs/tags/signed-long test_expect_success 'set up multiple-sort tags' ' for when in 100000 200000
It's useful and efficient to be able to get the size of the contents directly without having to pipe through `wc -c`. Also the result of the following: `git for-each-ref --format='%(contents)' | wc -c` is off by one as `git for-each-ref` appends a newline character after the contents, which can be seen by comparing its ouput with the output from `git cat-file`. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> --- Documentation/git-for-each-ref.txt | 3 +++ ref-filter.c | 7 ++++++- t/t6300-for-each-ref.sh | 17 +++++++++++++++++ 3 files changed, 26 insertions(+), 1 deletion(-)