Message ID | 20190613215102.44627-1-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] rev-list: teach --oid-only to enable piping | expand |
On Thu, Jun 13, 2019 at 02:51:03PM -0700, Emily Shaffer wrote: > It didn't appear that using an existing --pretty string would do the > trick, as the formatting options for --pretty apply specifically to > commit objects (you can specify the commit hash and the tree hash, but > I didn't see a way to more generally pretty-print all types of objects). Right, it would be a big change to start custom-formatting the non-commits. I think this is a good step in the right direction. > rev-list doesn't appear to use parse-options-h, so I added a new option > as simply as I could see without breaking existing style; it seemed > wrong to add a flag to the `struct rev_list_info` as the definition of > struct and flag values alike are contained in bisect.h. There are a > couple other options to rev-list which are stored as globals. I think that's reasonable. This is definitely a rev-list option (not a revision.c one). The interaction between bisect.h and rev-list is a bit funny, but it's probably simpler not to try solving it here. > + if (arg_oid_only && revs.commit_format != CMIT_FMT_UNSPECIFIED) > + die(_("cannot combine --oid-only and --pretty or --header")); As you've implemented it here, I definitely agree that this conflicts with --pretty. But I think it also interacts badly with other options, like "--graph". TBH, I am not sure that "rev-list --graph --objects" is all that useful, because the non-commit objects are not prefixed with the graph lines. And without "--objects", this new option is not useful because the default is already to show only the oid. But I wonder if things would be simpler if we did not touch the commit code path at all. I.e., if this were simply "--no-object-names", and it touched only show_object(). And then you do not have to worry about funny interactions with commit formatting at all. It would be valid to do: git rev-list --pretty=raw --objects --no-object-names HEAD if you really wanted to (though I admit I do not have an immediate use for it). > @@ -255,6 +262,10 @@ static void show_object(struct object *obj, const char *name, void *cb_data) > display_progress(progress, ++progress_counter); > if (info->flags & REV_LIST_QUIET) > return; > + if (arg_oid_only) { > + printf("%s\n", oid_to_hex(&obj->oid)); > + return; > + } > show_object_with_name(stdout, obj, name); > } > A minor style point, but I think this might be easier to follow without the early return, since we are really choosing to do A or B. Writing: if (arg_oid_only) printf(...); else show_object_with_name(...); shows that more clearly, I think. > [...] Otherwise, this looks good, including the tests. One thing I did notice in the tests: > +test_expect_success 'rev-list --objects --oid-only is usable by cat-file' ' > + git rev-list --objects --oid-only --all >list-output && > + git cat-file --batch-check <list-output >cat-output && > + ! grep missing cat-output > +' Usually we prefer to look for the expected output, rather than making sure we did not find the unexpected. But I'm not sure if that might be burdensome in this case (i.e., if there's a bunch of cruft coming out of "rev-list" that would be annoying to match, and might even change as people add more tests). So I'm OK with it either way. -Peff
Jeff King <peff@peff.net> writes: > But I wonder if things would be simpler if we did not touch the commit > code path at all. I.e., if this were simply "--no-object-names", and it > touched only show_object(). Yeah, that sounds more tempting. And the refined code structure you suggested ... >> @@ -255,6 +262,10 @@ static void show_object(struct object *obj, const char *name, void *cb_data) >> display_progress(progress, ++progress_counter); >> if (info->flags & REV_LIST_QUIET) >> return; >> + if (arg_oid_only) { >> + printf("%s\n", oid_to_hex(&obj->oid)); >> + return; >> + } >> show_object_with_name(stdout, obj, name); >> } >> > > A minor style point, but I think this might be easier to follow without > the early return, since we are really choosing to do A or B. Writing: > > if (arg_oid_only) > printf(...); > else > show_object_with_name(...); > > shows that more clearly, I think. ... is a good way to clearly show that intention, I would think.
On Fri, Jun 14, 2019 at 01:25:59PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > But I wonder if things would be simpler if we did not touch the commit > > code path at all. I.e., if this were simply "--no-object-names", and it > > touched only show_object(). > > Yeah, that sounds more tempting. And the refined code structure you > suggested ... > > >> @@ -255,6 +262,10 @@ static void show_object(struct object *obj, const char *name, void *cb_data) > >> display_progress(progress, ++progress_counter); > >> if (info->flags & REV_LIST_QUIET) > >> return; > >> + if (arg_oid_only) { > >> + printf("%s\n", oid_to_hex(&obj->oid)); > >> + return; > >> + } > >> show_object_with_name(stdout, obj, name); > >> } > >> > > > > A minor style point, but I think this might be easier to follow without > > the early return, since we are really choosing to do A or B. Writing: > > > > if (arg_oid_only) > > printf(...); > > else > > show_object_with_name(...); > > > > shows that more clearly, I think. > > ... is a good way to clearly show that intention, I would think. Sounds good. Thanks, both; I'll reroll that quickly today.
On Fri, Jun 14, 2019 at 12:07:28PM -0400, Jeff King wrote: > On Thu, Jun 13, 2019 at 02:51:03PM -0700, Emily Shaffer wrote: > > +test_expect_success 'rev-list --objects --oid-only is usable by cat-file' ' > > + git rev-list --objects --oid-only --all >list-output && > > + git cat-file --batch-check <list-output >cat-output && > > + ! grep missing cat-output > > +' > > Usually we prefer to look for the expected output, rather than making > sure we did not find the unexpected. But I'm not sure if that might be > burdensome in this case (i.e., if there's a bunch of cruft coming out of > "rev-list" that would be annoying to match, and might even change as > people add more tests). So I'm OK with it either way. My (newbie) opinion is that in this case, we specifically want to know that cat-file didn't choke on objects which we know exist (since they came from rev-list). I have the feeling that checking for the exact objects returned instead (or a sample of them) would be more brittle and would also make the wording of the test less direct. So if there's no complaint either way, I'd prefer to leave it the way it is. By the way, rev-list-misc.sh has a number of other existing "! grep ..." lines. - Emily
On Fri, Jun 14, 2019 at 04:29:46PM -0700, Emily Shaffer wrote: > On Fri, Jun 14, 2019 at 12:07:28PM -0400, Jeff King wrote: > > On Thu, Jun 13, 2019 at 02:51:03PM -0700, Emily Shaffer wrote: > > > > +test_expect_success 'rev-list --objects --oid-only is usable by cat-file' ' > > > + git rev-list --objects --oid-only --all >list-output && > > > + git cat-file --batch-check <list-output >cat-output && > > > + ! grep missing cat-output > > > +' > > > > Usually we prefer to look for the expected output, rather than making > > sure we did not find the unexpected. But I'm not sure if that might be > > burdensome in this case (i.e., if there's a bunch of cruft coming out of > > "rev-list" that would be annoying to match, and might even change as > > people add more tests). So I'm OK with it either way. > > My (newbie) opinion is that in this case, we specifically want to know > that cat-file didn't choke on objects which we know exist (since they > came from rev-list). I have the feeling that checking for the exact > objects returned instead (or a sample of them) would be more brittle and > would also make the wording of the test less direct. > > So if there's no complaint either way, I'd prefer to leave it the way it > is. Yeah, that's fine with me if it seems more clear to use grep here (and I was on the fence). > By the way, rev-list-misc.sh has a number of other existing "! grep ..." > lines. It never fails that when I complain about a style issue, the surrounding code is full of the same thing. ;) I'd have to look at each one to determine if they're sensible or not, and it's probably not worth anybody's time to do that cleanup at this point in time. -Peff
diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 9f31837d30..ff07ea9564 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -48,7 +48,7 @@ static const char rev_list_usage[] = " --children\n" " --objects | --objects-edge\n" " --unpacked\n" -" --header | --pretty\n" +" --header | --pretty | --oid-only\n" " --abbrev=<n> | --no-abbrev\n" " --abbrev-commit\n" " --left-right\n" @@ -75,6 +75,8 @@ enum missing_action { }; static enum missing_action arg_missing_action; +static int arg_oid_only; /* display only the oid of each object encountered */ + #define DEFAULT_OIDSET_SIZE (16*1024) static void finish_commit(struct commit *commit, void *data); @@ -90,6 +92,11 @@ static void show_commit(struct commit *commit, void *data) return; } + if (arg_oid_only) { + printf("%s\n", oid_to_hex(&commit->object.oid)); + return; + } + graph_show_commit(revs->graph); if (revs->count) { @@ -255,6 +262,10 @@ static void show_object(struct object *obj, const char *name, void *cb_data) display_progress(progress, ++progress_counter); if (info->flags & REV_LIST_QUIET) return; + if (arg_oid_only) { + printf("%s\n", oid_to_hex(&obj->oid)); + return; + } show_object_with_name(stdout, obj, name); } @@ -484,6 +495,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) if (skip_prefix(arg, "--missing=", &arg)) continue; /* already handled above */ + if (!strcmp(arg, ("--oid-only"))) { + arg_oid_only = 1; + continue; + } + usage(rev_list_usage); } @@ -499,6 +515,9 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) /* Only --header was specified */ revs.commit_format = CMIT_FMT_RAW; + if (arg_oid_only && revs.commit_format != CMIT_FMT_UNSPECIFIED) + die(_("cannot combine --oid-only and --pretty or --header")); + if ((!revs.commits && reflog_walk_empty(revs.reflog_info) && (!(revs.tag_objects || revs.tree_objects || revs.blob_objects) && !revs.pending.nr) && diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh index 0507999729..996ba1799b 100755 --- a/t/t6000-rev-list-misc.sh +++ b/t/t6000-rev-list-misc.sh @@ -48,6 +48,24 @@ test_expect_success 'rev-list --objects with pathspecs and copied files' ' ! grep one output ' +test_expect_success 'rev-list --objects --oid-only has no whitespace/names' ' + git rev-list --objects --oid-only HEAD >output && + ! grep wanted_file output && + ! grep unwanted_file output && + ! grep " " output +' + +test_expect_success 'rev-list --objects --oid-only is usable by cat-file' ' + git rev-list --objects --oid-only --all >list-output && + git cat-file --batch-check <list-output >cat-output && + ! grep missing cat-output +' + +test_expect_success 'rev-list --oid-only is incompatible with --pretty' ' + test_must_fail git rev-list --objects --oid-only --pretty HEAD && + test_must_fail git rev-list --objects --oid-only --header HEAD +' + test_expect_success 'rev-list A..B and rev-list ^A B are the same' ' git commit --allow-empty -m another && git tag -a -m "annotated" v1.0 &&
Allow easier parsing by cat-file by giving rev-list an option to print only the OID of an object without any additional information. This is a short-term shim; later on, rev-list should be taught how to print the types of objects it finds in a format similar to cat-file's. Before this commit, the output from rev-list needed to be massaged before being piped to cat-file, like so: git rev-list --objects HEAD | cut -f 1 -d ' ' \ | git cat-file --batch-check This was especially unexpected when dealing with root trees, as an invisible whitespace exists at the end of the OID: git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ | xargs -I% echo "AA%AA" Now, it can be piped directly, as in the added test case: git rev-list --objects --oid-only HEAD | git cat-file --batch-check Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Change-Id: I489bdf0a8215532e540175188883ff7541d70e1b --- It didn't appear that using an existing --pretty string would do the trick, as the formatting options for --pretty apply specifically to commit objects (you can specify the commit hash and the tree hash, but I didn't see a way to more generally pretty-print all types of objects). rev-list doesn't appear to use parse-options-h, so I added a new option as simply as I could see without breaking existing style; it seemed wrong to add a flag to the `struct rev_list_info` as the definition of struct and flag values alike are contained in bisect.h. There are a couple other options to rev-list which are stored as globals. At the moment, this doesn't work with --abbrev, and although I think it'd be a good fit, right now --abbrev doesn't seem to work without --abbrev-commit, and both those options end up being eaten by setup_revisions() rather than by rev-list itself. - Emily builtin/rev-list.c | 21 ++++++++++++++++++++- t/t6000-rev-list-misc.sh | 18 ++++++++++++++++++ 2 files changed, 38 insertions(+), 1 deletion(-)