mbox series

[v2,00/10] make "mktag" use fsck_tag()

Message ID 20201126222257.5629-1-avarab@gmail.com (mailing list archive)
Headers show
Series make "mktag" use fsck_tag() | expand

Message

Ævar Arnfjörð Bjarmason Nov. 26, 2020, 10:22 p.m. UTC
Now a non-RFC. I went for the approach I suggested in
<87r1ognv4b.fsf@evledraar.gmail.com> of just having fsck_tag() able to
optionally tell us about its parsed tag/type, thus avoiding any need
for a custom parser in mktag.c. Hopefully I've addressed the rest of
the feedback, range-diff below.

Ævar Arnfjörð Bjarmason (10):
  mktag doc: say <hash> not <sha1>
  mktag: use default strbuf_read() hint
  mktag: reword write_object_file() error
  mktag: remove redundant braces in one-line body "if"
  mktag tests: don't needlessly use a subshell
  mktag tests: remove needless SHA-1 hardcoding
  mktag tests: improve verify_object() test coverage
  fsck: add new "extra" checks for "mktag"
  mktag: use fsck instead of custom verify_tag()
  mktag doc: update to explain why to use this

 Documentation/git-hash-object.txt |   4 +
 Documentation/git-mktag.txt       |  31 ++++-
 builtin/mktag.c                   | 202 ++++++++----------------------
 fsck.c                            |  47 ++++++-
 fsck.h                            |  11 ++
 t/t1006-cat-file.sh               |   2 +-
 t/t3800-mktag.sh                  | 117 ++++++++++++-----
 7 files changed, 222 insertions(+), 192 deletions(-)

Range-diff:
11:  a7a6524b2a !  1:  f46abb37df mktag docs: say <hash> not <sha1>
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    mktag docs: say <hash> not <sha1>
    +    mktag doc: say <hash> not <sha1>
     
         Change the "mktag" documentation to refer to the input hash as just
         "hash", not "sha1". This command has supported SHA-256 for a while
 1:  f6d9482c5b =  2:  1b4d9a5330 mktag: use default strbuf_read() hint
 2:  49be8dbdd3 =  3:  83f4af6013 mktag: reword write_object_file() error
 3:  868d424e65 =  4:  bca1484ed9 mktag: remove redundant braces in one-line body "if"
 4:  e57969ce5d =  5:  ac7c4097c9 mktag tests: don't needlessly use a subshell
 5:  8cf5d95b1e !  6:  5e076659e4 mktag tests: remove needless SHA-1 hardcoding
    @@ Commit message
     
         Some of these tests were failing for the wrong reasons. The first one
         being modified here would fail because the line starts with "xxxxxx"
    -    instead of "object", the rest of the line doesn't matter. Let's just
    -    put a valid hash on the rest of the line anyway to narrow the test
    -    down for just the s/object/xxxxxx/ case.
    +    instead of "object", the rest of the line doesn't matter.
    +
    +    Let's just put a valid hash on the rest of the line anyway to narrow
    +    the test down for just the s/object/xxxxxx/ case.
     
         The second one being modified here would fail under
         GIT_TEST_DEFAULT_HASH=sha256 because <some sha-1 length garbage> is an
    @@ Commit message
         garbage> when under SHA-256.
     
         This doesn't really matter since we should be able to trust other
    -    parts of the code to validate things in the 0-9a-f range, but let's do
    -    it for good measure.
    +    parts of the code to validate things in the 0-9a-f range, but let's
    +    keep it for good measure.
     
         There's a later test which tests an invalid SHA which looks like a
         valid one, to stress the "We refuse to tag something we can't
         verify[...]" logic in mktag.c.
     
         But here we're testing for a SHA-length string which contains
    -    characters outside of the /[0-9a-f]/i set. Let's just do that with a
    -    ROT13 invocation.
    -
    -    We could get really unlucky and switch to a future hash function that
    -    just happens to produce all [0-9] output for this particular input,
    -    but that's very unlikely.
    +    characters outside of the /[0-9a-f]/i set.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ t/t3800-mktag.sh: EOF
     -#  3. object line SHA1 check
     +#  3. object line SHA check
      
    -+invalid_sha=$(echo $head | tr A-Za-z N-ZA-Mn-za-m)
      cat >tag.sig <<EOF
     -object zz9e9b33986b1c2670fff52c5067603117b3e895
    -+object $invalid_sha
    ++object $(echo $head | tr 0-9a-f z)
      type tag
      tag mytag
      tagger . <> 0 +0000
 6:  2899f9cda5 <  -:  ---------- mktag tests: improve verify_object() test coverage
 -:  ---------- >  7:  a048c3e640 mktag tests: improve verify_object() test coverage
 7:  70d1d877aa =  8:  dab44d3235 fsck: add new "extra" checks for "mktag"
 8:  d37d684135 !  9:  8ff853caee mktag: use fsck instead of custom verify_tag()
    @@ Metadata
      ## Commit message ##
         mktag: use fsck instead of custom verify_tag()
     
    -    TODO: This subtly breaks one check, see the last patch in this series.
    -
         Change the validation logic in "mktag" to use fsck's fsck_tag()
         instead of its own custom parser. Curiously the logic for both dates
         back to the same commit[1]. Let's unify them so we're not maintaining
    @@ Commit message
             Islands when it comes to timezone extravagance.
     
          B. fsck allows missing author names such as "tagger <email>", mktag
    -        wouldn't, but would allow e.g. "tagger <email>" (but not "tagger
    -        <email>"). Now we allow all of these.
    +        wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but
    +        not "tagger [1 space] <email>"). Now we allow all of these.
     
          C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
             allows it.
    @@ builtin/mktag.c
      #include "object-store.h"
     +#include "fsck.h"
      
    - /*
    -  * A signature file has a very simple fixed format: four lines
    -@@ builtin/mktag.c: static int verify_tag(char *buffer, unsigned long size)
    - 	const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb, *p;
    - 	size_t len;
    - 
    -+	/* verify_tag() will be removed in the next commit */
    -+	return 0;
    -+
    - 	if (size < 84)
    - 		return error("wanna fool me ? you obviously got the size wrong !");
    - 
    -@@ builtin/mktag.c: static int verify_tag(char *buffer, unsigned long size)
    - 	return 0;
    - }
    - 
    +-/*
    +- * A signature file has a very simple fixed format: four lines
    +- * of "object <sha1>" + "type <typename>" + "tag <tagname>" +
    +- * "tagger <committer>", followed by a blank line, a free-form tag
    +- * message and a signature block that git itself doesn't care about,
    +- * but that can be verified with gpg or similar.
    +- *
    +- * The first four lines are guaranteed to be at least 83 bytes:
    +- * "object <sha1>\n" is 48 bytes, "type tag\n" at 9 bytes is the
    +- * shortest possible type-line, "tag .\n" at 6 bytes is the shortest
    +- * single-character-tag line, and "tagger . <> 0 +0000\n" at 20 bytes is
    +- * the shortest possible tagger-line.
    +- */
    +-
    +-/*
    +- * We refuse to tag something we can't verify. Just because.
    +- */
    +-static int verify_object(const struct object_id *oid, const char *expected_type)
     +static int mktag_fsck_error_func(struct fsck_options *o,
     +				 const struct object_id *oid,
     +				 enum object_type object_type,
     +				 int msg_type, const char *message)
    -+{
    + {
    +-	int ret = -1;
    +-	enum object_type type;
    +-	unsigned long size;
    +-	void *buffer = read_object_file(oid, &type, &size);
    +-	const struct object_id *repl = lookup_replace_object(the_repository, oid);
    +-
    +-	if (buffer) {
    +-		if (type == type_from_string(expected_type)) {
    +-			ret = check_object_signature(the_repository, repl,
    +-						     buffer, size,
    +-						     expected_type);
    +-		}
    +-		free(buffer);
     +	switch (msg_type) {
     +	case FSCK_WARN:
     +	case FSCK_ERROR:
    @@ builtin/mktag.c: static int verify_tag(char *buffer, unsigned long size)
     +		 * like missing "tagger" lines are "only" warnings
     +		 * under fsck, we've always considered them an error.
     +		 */
    -+		fprintf_ln(stderr, "error: %s", message);
    ++		fprintf_ln(stderr, "error: tag input does not pass fsck: %s", message);
     +		return 1;
     +	default:
     +		BUG("%d (FSCK_IGNORE?) should never trigger this callback",
     +		    msg_type);
    -+	}
    -+}
    + 	}
    +-	return ret;
    + }
    + 
    +-static int verify_tag(char *buffer, unsigned long size)
    ++static int verify_object_in_tag(struct object_id *tagged_oid, int *tagged_type)
    + {
    +-	int typelen;
    +-	char type[20];
    +-	struct object_id oid;
    +-	const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb, *p;
    +-	size_t len;
    +-
    +-	if (size < 84)
    +-		return error("wanna fool me ? you obviously got the size wrong !");
    +-
    +-	buffer[size] = 0;
    +-
    +-	/* Verify object line */
    +-	object = buffer;
    +-	if (memcmp(object, "object ", 7))
    +-		return error("char%d: does not start with \"object \"", 0);
    +-
    +-	if (parse_oid_hex(object + 7, &oid, &p))
    +-		return error("char%d: could not get SHA1 hash", 7);
    +-
    +-	/* Verify type line */
    +-	type_line = p + 1;
    +-	if (memcmp(type_line - 1, "\ntype ", 6))
    +-		return error("char%d: could not find \"\\ntype \"", 47);
    +-
    +-	/* Verify tag-line */
    +-	tag_line = strchr(type_line, '\n');
    +-	if (!tag_line)
    +-		return error("char%"PRIuMAX": could not find next \"\\n\"",
    +-				(uintmax_t) (type_line - buffer));
    +-	tag_line++;
    +-	if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n')
    +-		return error("char%"PRIuMAX": no \"tag \" found",
    +-				(uintmax_t) (tag_line - buffer));
    +-
    +-	/* Get the actual type */
    +-	typelen = tag_line - type_line - strlen("type \n");
    +-	if (typelen >= sizeof(type))
    +-		return error("char%"PRIuMAX": type too long",
    +-				(uintmax_t) (type_line+5 - buffer));
    +-
    +-	memcpy(type, type_line+5, typelen);
    +-	type[typelen] = 0;
    +-
    +-	/* Verify that the object matches */
    +-	if (verify_object(&oid, type))
    +-		return error("char%d: could not verify object %s", 7, oid_to_hex(&oid));
    +-
    +-	/* Verify the tag-name: we don't allow control characters or spaces in it */
    +-	tag_line += 4;
    +-	for (;;) {
    +-		unsigned char c = *tag_line++;
    +-		if (c == '\n')
    +-			break;
    +-		if (c > ' ')
    +-			continue;
    +-		return error("char%"PRIuMAX": could not verify tag name",
    +-				(uintmax_t) (tag_line - buffer));
    +-	}
    +-
    +-	/* Verify the tagger line */
    +-	tagger_line = tag_line;
    +-
    +-	if (memcmp(tagger_line, "tagger ", 7))
    +-		return error("char%"PRIuMAX": could not find \"tagger \"",
    +-			(uintmax_t) (tagger_line - buffer));
    +-
    +-	/*
    +-	 * Check for correct form for name and email
    +-	 * i.e. " <" followed by "> " on _this_ line
    +-	 * No angle brackets within the name or email address fields.
    +-	 * No spaces within the email address field.
    +-	 */
    +-	tagger_line += 7;
    +-	if (!(lb = strstr(tagger_line, " <")) || !(rb = strstr(lb+2, "> ")) ||
    +-		strpbrk(tagger_line, "<>\n") != lb+1 ||
    +-		strpbrk(lb+2, "><\n ") != rb)
    +-		return error("char%"PRIuMAX": malformed tagger field",
    +-			(uintmax_t) (tagger_line - buffer));
    +-
    +-	/* Check for author name, at least one character, space is acceptable */
    +-	if (lb == tagger_line)
    +-		return error("char%"PRIuMAX": missing tagger name",
    +-			(uintmax_t) (tagger_line - buffer));
    +-
    +-	/* timestamp, 1 or more digits followed by space */
    +-	tagger_line = rb + 2;
    +-	if (!(len = strspn(tagger_line, "0123456789")))
    +-		return error("char%"PRIuMAX": missing tag timestamp",
    +-			(uintmax_t) (tagger_line - buffer));
    +-	tagger_line += len;
    +-	if (*tagger_line != ' ')
    +-		return error("char%"PRIuMAX": malformed tag timestamp",
    +-			(uintmax_t) (tagger_line - buffer));
    +-	tagger_line++;
    +-
    +-	/* timezone, 5 digits [+-]hhmm, max. 1400 */
    +-	if (!((tagger_line[0] == '+' || tagger_line[0] == '-') &&
    +-	      strspn(tagger_line+1, "0123456789") == 4 &&
    +-	      tagger_line[5] == '\n' && atoi(tagger_line+1) <= 1400))
    +-		return error("char%"PRIuMAX": malformed tag timezone",
    +-			(uintmax_t) (tagger_line - buffer));
    +-	tagger_line += 6;
    +-
    +-	/* Verify the blank line separating the header from the body */
    +-	if (*tagger_line != '\n')
    +-		return error("char%"PRIuMAX": trailing garbage in tag header",
    +-			(uintmax_t) (tagger_line - buffer));
    ++	int ret;
    ++	enum object_type type;
    ++	unsigned long size;
    ++	void *buffer;
    ++	const struct object_id *repl;
    ++
    ++	buffer = read_object_file(tagged_oid, &type, &size);
    ++	if (!buffer)
    ++		die("could not read tagged object '%s'\n",
    ++		    oid_to_hex(tagged_oid));
    ++	if (type != *tagged_type)
    ++		die("object '%s' tagged as '%s', but is a '%s' type\n",
    ++		    oid_to_hex(tagged_oid),
    ++		    type_name(*tagged_type), type_name(type));
     +
    ++	repl = lookup_replace_object(the_repository, tagged_oid);
    ++	ret = check_object_signature(the_repository, repl,
    ++				     buffer, size, type_name(*tagged_type));
    ++	free(buffer);
    + 
    +-	/* The actual stuff afterwards we don't care about.. */
    +-	return 0;
    ++	return ret;
    + }
    + 
      int cmd_mktag(int argc, const char **argv, const char *prefix)
      {
    -+	struct object obj;
      	struct strbuf buf = STRBUF_INIT;
    - 	struct object_id result;
     +	struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
    ++	struct object_id tagged_oid;
    ++	int tagged_type;
    + 	struct object_id result;
      
      	if (argc != 1)
    - 		usage("git mktag");
     @@ builtin/mktag.c: int cmd_mktag(int argc, const char **argv, const char *prefix)
      	if (strbuf_read(&buf, 0, 0) < 0)
      		die_errno("could not read from stdin");
    @@ builtin/mktag.c: int cmd_mktag(int argc, const char **argv, const char *prefix)
     -	   "object <sha1>\ntype\ntagger " */
     -	if (verify_tag(buf.buf, buf.len) < 0)
     -		die("invalid tag signature file");
    -+	/* verify_tag() will be removed in the next commit */
    -+	verify_tag("", 0);
    -+
    -+	/*
    -+	 * Fake up an object for fsck_object()
    -+	 */
    -+	obj.parsed = 1;
    -+	obj.type = OBJ_TAG;
    -+
     +	fsck_options.extra = 1;
     +	fsck_options.error_func = mktag_fsck_error_func;
    -+	if (fsck_object(&obj, buf.buf, buf.len, &fsck_options))
    ++	if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options,
    ++				&tagged_oid, &tagged_type))
     +		die("tag on stdin did not pass our strict fsck check");
    ++
    ++	if (verify_object_in_tag(&tagged_oid, &tagged_type))
    ++		die("tag on stdin did not refer to a valid object");
      
      	if (write_object_file(buf.buf, buf.len, tag_type, &result) < 0)
      		die("unable to write annotated tag object");
     
    + ## fsck.c ##
    +@@ fsck.c: static int fsck_tag(const struct object_id *oid, const char *buffer,
    + 		    unsigned long size, struct fsck_options *options)
    + {
    + 	struct object_id tagged_oid;
    ++	int tagged_type;
    ++	return fsck_tag_standalone(oid, buffer, size, options, &tagged_oid,
    ++				   &tagged_type);
    ++}
    ++
    ++int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
    ++			unsigned long size, struct fsck_options *options,
    ++			struct object_id *tagged_oid,
    ++			int *tagged_type)
    ++{
    + 	int ret = 0;
    + 	char *eol;
    + 	struct strbuf sb = STRBUF_INIT;
    +@@ fsck.c: static int fsck_tag(const struct object_id *oid, const char *buffer,
    + 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
    + 		goto done;
    + 	}
    +-	if (parse_oid_hex(buffer, &tagged_oid, &p) || *p != '\n') {
    ++	if (parse_oid_hex(buffer, tagged_oid, &p) || *p != '\n') {
    + 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
    + 		if (ret)
    + 			goto done;
    +@@ fsck.c: static int fsck_tag(const struct object_id *oid, const char *buffer,
    + 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
    + 		goto done;
    + 	}
    +-	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
    ++	*tagged_type = type_from_string_gently(buffer, eol - buffer, 1);
    ++	if (*tagged_type < 0)
    + 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_TYPE, "invalid 'type' value");
    + 	if (ret)
    + 		goto done;
    +
    + ## fsck.h ##
    +@@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
    + int fsck_object(struct object *obj, void *data, unsigned long size,
    + 	struct fsck_options *options);
    + 
    ++/*
    ++ * fsck a tag, and pass info about it back to the caller. This is
    ++ * exposed fsck_object() internals for git-mktag(1).
    ++ */
    ++int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
    ++			unsigned long size, struct fsck_options *options,
    ++			struct object_id *tagged_oid,
    ++			int *tag_type);
    ++
    + /*
    +  * Some fsck checks are context-dependent, and may end up queued; run this
    +  * after completing all fsck_object() calls in order to resolve any remaining
    +
      ## t/t1006-cat-file.sh ##
     @@ t/t1006-cat-file.sh: tag_content="$tag_header_without_timestamp 0000000000 +0000
      
    @@ t/t3800-mktag.sh: too short for a tag
      
      check_verify_failure 'Tag object length check' \
     -	'^error: .*size wrong.*$'
    -+	'^error: missingObject:'
    ++	'^error:.* missingObject:'
      
      ############################################################
      #  2. object line label check
    @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      EOF
      
     -check_verify_failure '"object" line label check' '^error: char0: .*"object "$'
    -+check_verify_failure '"object" line label check' '^error: missingObject:'
    ++check_verify_failure '"object" line label check' '^error:.* missingObject:'
      
      ############################################################
      #  3. object line SHA check
    @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      EOF
      
     -check_verify_failure '"object" line SHA1 check' '^error: char7: .*SHA1 hash$'
    -+check_verify_failure '"object" line check' '^error: badObjectSha1:'
    ++check_verify_failure '"object" line check' '^error:.* badObjectSha1:'
      
      ############################################################
      #  4. type line label check
    @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      EOF
      
     -check_verify_failure '"type" line label check' '^error: char.*: .*"\\ntype "$'
    -+check_verify_failure '"type" line label check' '^error: missingTypeEntry:'
    ++check_verify_failure '"type" line label check' '^error:.* missingTypeEntry:'
      
      ############################################################
      #  5. type line eol check
    @@ t/t3800-mktag.sh: check_verify_failure '"type" line label check' '^error: char.*
      printf "type tagsssssssssssssssssssssssssssssss" >>tag.sig
      
     -check_verify_failure '"type" line eol check' '^error: char.*: .*"\\n"$'
    -+check_verify_failure '"type" line eol check' '^error: unterminatedHeader:'
    ++check_verify_failure '"type" line eol check' '^error:.* unterminatedHeader:'
      
      ############################################################
      #  6. tag line label check #1
    @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      
      check_verify_failure '"tag" line label check #1' \
     -	'^error: char.*: no "tag " found$'
    -+	'^error: missingTagEntry:'
    ++	'^error:.* missingTagEntry:'
      
      ############################################################
      #  7. tag line label check #2
    @@ t/t3800-mktag.sh: tag
      
      check_verify_failure '"tag" line label check #2' \
     -	'^error: char.*: no "tag " found$'
    -+	'^error: badType:'
    ++	'^error:.* badType:'
      
      ############################################################
      #  8. type line type-name length check
    @@ t/t3800-mktag.sh: tag mytag
      
      check_verify_failure '"type" line type-name length check' \
     -	'^error: char.*: type too long$'
    -+	'^error: badType:'
    ++	'^error:.* badType:'
      
      ############################################################
    - #  9. verify object (SHA1/type) check
    + #  9. verify object (SHA/type) check
     @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      EOF
      
    - check_verify_failure 'verify object (SHA1/type) check' \
    + check_verify_failure 'verify object (SHA/type) check -- correct type, nonexisting object' \
     -	'^error: char7: could not verify object.*$'
    -+	'^error: badType:'
    ++	'^fatal: could not read tagged object'
    + 
    + cat >tag.sig <<EOF
    + object $head
    +@@ t/t3800-mktag.sh: tagger . <> 0 +0000
    + EOF
    + 
    + check_verify_failure 'verify object (SHA/type) check -- made-up type, nonexisting object' \
    +-	'^fatal: invalid object type'
    ++	'^error:.* badType:'
    + 
    + cat >tag.sig <<EOF
    + object $(test_oid deadbeef)
    +@@ t/t3800-mktag.sh: tagger . <> 0 +0000
    + EOF
    + 
    + check_verify_failure 'verify object (SHA/type) check -- incorrect type, valid object' \
    +-	'^error: char7: could not verify object.*$'
    ++	'^error:.* badType:'
    + 
    + cat >tag.sig <<EOF
    + object $head
    +@@ t/t3800-mktag.sh: tagger . <> 0 +0000
    + EOF
    + 
    + check_verify_failure 'verify object (SHA/type) check -- incorrect type, valid object' \
    +-	'^error: char7: could not verify object'
    ++	'^fatal: object.*tagged as.*tree.*but is.*commit'
      
      ############################################################
      # 10. verify tag-name check
    @@ t/t3800-mktag.sh: tagger . <> 0 +0000
      
      check_verify_failure 'verify tag-name check' \
     -	'^error: char.*: could not verify tag name$'
    -+	'^error: badTagName:'
    ++	'^error:.* badTagName:'
      
      ############################################################
      # 11. tagger line label check #1
    @@ t/t3800-mktag.sh: This is filler
      
      check_verify_failure '"tagger" line label check #1' \
     -	'^error: char.*: could not find "tagger "$'
    -+	'^error: missingTaggerEntry:'
    ++	'^error:.* missingTaggerEntry:'
      
      ############################################################
      # 12. tagger line label check #2
    @@ t/t3800-mktag.sh: This is filler
      
      check_verify_failure '"tagger" line label check #2' \
     -	'^error: char.*: could not find "tagger "$'
    -+	'^error: missingTaggerEntry:'
    ++	'^error:.* missingTaggerEntry:'
      
      ############################################################
     -# 13. disallow missing tag author name
    @@ t/t3800-mktag.sh: tagger T A Gger <
      
      check_verify_failure 'disallow malformed tagger' \
     -	'^error: char.*: malformed tagger field$'
    -+	'^error: badEmail:'
    ++	'^error:.* badEmail:'
      
      ############################################################
      # 15. allow empty tag email
    @@ t/t3800-mktag.sh: tagger T A Gger <tagger@example.com>__
      
      check_verify_failure 'disallow missing tag timestamp' \
     -	'^error: char.*: missing tag timestamp$'
    -+	'^error: badDate:'
    ++	'^error:.* badDate:'
      
      ############################################################
      # 18. detect invalid tag timestamp1
    @@ t/t3800-mktag.sh: tagger T A Gger <tagger@example.com> Tue Mar 25 15:47:44 2008
      
      check_verify_failure 'detect invalid tag timestamp1' \
     -	'^error: char.*: missing tag timestamp$'
    -+	'^error: badDate:'
    ++	'^error:.* badDate:'
      
      ############################################################
      # 19. detect invalid tag timestamp2
    @@ t/t3800-mktag.sh: tagger T A Gger <tagger@example.com> 2008-03-31T12:20:15-0500
      
      check_verify_failure 'detect invalid tag timestamp2' \
     -	'^error: char.*: malformed tag timestamp$'
    -+	'^error: badDate:'
    ++	'^error:.* badDate:'
      
      ############################################################
      # 20. detect invalid tag timezone1
    @@ t/t3800-mktag.sh: tagger T A Gger <tagger@example.com> 1206478233 GMT
      
      check_verify_failure 'detect invalid tag timezone1' \
     -	'^error: char.*: malformed tag timezone$'
    -+	'^error: badTimezone:'
    ++	'^error:.* badTimezone:'
      
      ############################################################
      # 21. detect invalid tag timezone2
    @@ t/t3800-mktag.sh: tagger T A Gger <tagger@example.com> 1206478233 +  30
      
      check_verify_failure 'detect invalid tag timezone2' \
     -	'^error: char.*: malformed tag timezone$'
    -+	'^error: badTimezone:'
    ++	'^error:.* badTimezone:'
      
      ############################################################
     -# 22. detect invalid tag timezone3
    @@ t/t3800-mktag.sh: this line should not be here
      
      check_verify_failure 'detect invalid header entry' \
     -	'^error: char.*: trailing garbage in tag header$'
    -+	'^error: extraHeaderEntry:'
    ++	'^error:.* extraHeaderEntry:'
     +
     +cat >tag.sig <<EOF
     +object $head
    @@ t/t3800-mktag.sh: this line should not be here
     +EOF
     +
     +check_verify_failure 'detect invalid header entry' \
    -+	'^error: extraHeaderBodyNewline:'
    ++	'^error:.* extraHeaderBodyNewline:'
      
      ############################################################
      # 24. create valid tag
 9:  403820698b <  -:  ---------- mktag: remove now-unused verify_tag() code
10:  78a4877450 = 10:  e38feefd3f mktag doc: update to explain why to use this
12:  0e840246a8 <  -:  ---------- mktag: bring back some of the verify_object() logic

Comments

Ævar Arnfjörð Bjarmason Dec. 1, 2020, 10:08 a.m. UTC | #1
On Thu, Nov 26 2020, Ævar Arnfjörð Bjarmason wrote:

> Now a non-RFC. I went for the approach I suggested in
> <87r1ognv4b.fsf@evledraar.gmail.com> of just having fsck_tag() able to
> optionally tell us about its parsed tag/type, thus avoiding any need
> for a custom parser in mktag.c. Hopefully I've addressed the rest of
> the feedback, range-diff below.

Ping @ Jeff & brian: you said you wanted this in one shape or another,
so mind seeing if the v2 looks good to you?:)

Junio didn't pick it up for the "What's Cooking" sent out recently,
hopefully some reviewer ACK/NACK will help move it forward. Thanks!
Junio C Hamano Dec. 1, 2020, 8:01 p.m. UTC | #2
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Nov 26 2020, Ævar Arnfjörð Bjarmason wrote:
>
>> Now a non-RFC. I went for the approach I suggested in
>> <87r1ognv4b.fsf@evledraar.gmail.com> of just having fsck_tag() able to
>> optionally tell us about its parsed tag/type, thus avoiding any need
>> for a custom parser in mktag.c. Hopefully I've addressed the rest of
>> the feedback, range-diff below.
>
> Ping @ Jeff & brian: you said you wanted this in one shape or another,
> so mind seeing if the v2 looks good to you?:)
>
> Junio didn't pick it up for the "What's Cooking" sent out recently,
> hopefully some reviewer ACK/NACK will help move it forward. Thanks!

True.  I don't want to queue too many topics on 'seen', only to end
up with a pile of patches that haven't been reviewed adequately and
cannot move forward.
Junio C Hamano Dec. 2, 2020, 10:20 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Nov 26 2020, Ævar Arnfjörð Bjarmason wrote:
>>
>>> Now a non-RFC. I went for the approach I suggested in
>>> <87r1ognv4b.fsf@evledraar.gmail.com> of just having fsck_tag() able to
>>> optionally tell us about its parsed tag/type, thus avoiding any need
>>> for a custom parser in mktag.c. Hopefully I've addressed the rest of
>>> the feedback, range-diff below.
>>
>> Ping @ Jeff & brian: you said you wanted this in one shape or another,
>> so mind seeing if the v2 looks good to you?:)
>>
>> Junio didn't pick it up for the "What's Cooking" sent out recently,
>> hopefully some reviewer ACK/NACK will help move it forward. Thanks!
>
> True.  I don't want to queue too many topics on 'seen', only to end
> up with a pile of patches that haven't been reviewed adequately and
> cannot move forward.

So, now I've seen all of them.

There were some minor things in the earlier part I commented on, and
if I am not mistaken, a new check, "body must not begin with a blank
line", should not be added at all, which would affect 08/10.

I am not sure how much longer we want to retain the .extra level of
checks that are more strict than those of fsck.  If we decide that
it is not worth it to forbid new object headers, we may be able to
lose 08/10 altogether.

Other than the above, the series mostly looked well done.

Thanks.
Ævar Arnfjörð Bjarmason Dec. 3, 2020, 4:38 p.m. UTC | #4
On Wed, Dec 02 2020, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>> On Thu, Nov 26 2020, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> Now a non-RFC. I went for the approach I suggested in
>>>> <87r1ognv4b.fsf@evledraar.gmail.com> of just having fsck_tag() able to
>>>> optionally tell us about its parsed tag/type, thus avoiding any need
>>>> for a custom parser in mktag.c. Hopefully I've addressed the rest of
>>>> the feedback, range-diff below.
>>>
>>> Ping @ Jeff & brian: you said you wanted this in one shape or another,
>>> so mind seeing if the v2 looks good to you?:)
>>>
>>> Junio didn't pick it up for the "What's Cooking" sent out recently,
>>> hopefully some reviewer ACK/NACK will help move it forward. Thanks!
>>
>> True.  I don't want to queue too many topics on 'seen', only to end
>> up with a pile of patches that haven't been reviewed adequately and
>> cannot move forward.
>
> So, now I've seen all of them.
>
> There were some minor things in the earlier part I commented on, and
> if I am not mistaken, a new check, "body must not begin with a blank
> line", should not be added at all, which would affect 08/10.
>
> I am not sure how much longer we want to retain the .extra level of
> checks that are more strict than those of fsck.  If we decide that
> it is not worth it to forbid new object headers, we may be able to
> lose 08/10 altogether.
>
> Other than the above, the series mostly looked well done.

Hi. Just a quick note to say thanks for all the feedback, and that I
don't have a v3 ready now, but will submit one soon-ish.

FWIW the header-body newline thing wasn't something I was trying to
sneak in, I went wrong in my testing somewhere and thought it was a bug
under mktag. Will test for that, remove that check or whatever as
appropriate. Thanks for the review.