Message ID | 20211115115153.48307-1-dyroneteng@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | support `--oid-only` in `ls-tree` | expand |
On Mon, Nov 15 2021, Teng Long wrote: > Sometimes, we only want to get the objects from output of `ls-tree` > and commands like `sed` or `cut` is usually used to intercept the > origin output to achieve this purpose in practical. > > The patch contains three commits > > 1. Implementation of the option. > 2. Add new tests in "t3104". > 3. Documentation modifications. > > I'm appreciate if someone help to review the patch. I've looked it over, they look correct mostly, the test code in 2/3 looks a bit too complex (using find?). But I'd much rather see this be done with adding strbuf_expand() to ls-tree. I.e. its docs say that it can emit: <mode> SP <type> SP <object> TAB <file> Or, with -l: <mode> SP <type> SP <object> SP <object size> TAB <file> If you use strbuf_expand() you can just define a default format of: %(objectmode) SP %(objecttype) SP %(objectname) TAB %(path) Then make the existing -l option a shorthand for tweaking that to: %(objectmode) SP %(objecttype) SP %(objectsize) SP %(objectname) TAB %(path) Then you can get what you want out of this with a simple: git ls-tree --format="%(objectname)" See e.g. git-cat-file for an existing use of strbuf_expand().
On Mon, Nov 15, 2021 at 04:13:24PM +0100, Ævar Arnfjörð Bjarmason wrote: > But I'd much rather see this be done with adding strbuf_expand() to > ls-tree. I.e. its docs say that it can emit: I had a similar thought, but that's a much bigger task. I think it would be reasonable to add --oid-only to match the existing --name-only, etc. If we later add a custom --format option, then it can easily be folded in and explained as "this is an alias for --format=%(objectname)", just like --name-only would become "this is an alias for --format=%(path)". -Peff
On Mon, Nov 15, 2021 at 07:51:50PM +0800, Teng Long wrote: > Sometimes, we only want to get the objects from output of `ls-tree` > and commands like `sed` or `cut` is usually used to intercept the > origin output to achieve this purpose in practical. > > The patch contains three commits > > 1. Implementation of the option. > 2. Add new tests in "t3104". > 3. Documentation modifications. > > I'm appreciate if someone help to review the patch. This seems like a good feature to have. I think it would make sense to squash the three patches into a single one. The documentation and test patches do not stand on their own, which is why there was nothing useful to say in their commit messages. The implementation looks generally sensible (modulo the comments already given). I was surprised that there was not an existing ls-tree script that these would fit into. But there really isn't; t3101 covers --name-only and other output, but is really focused on the pathnames (though I think it would be OK to refactor it to cover output more generally). -Peff
On Mon, Nov 15 2021, Jeff King wrote: > On Mon, Nov 15, 2021 at 04:13:24PM +0100, Ævar Arnfjörð Bjarmason wrote: > >> But I'd much rather see this be done with adding strbuf_expand() to >> ls-tree. I.e. its docs say that it can emit: > > I had a similar thought, but that's a much bigger task. I think it would > be reasonable to add --oid-only to match the existing --name-only, etc. > If we later add a custom --format option, then it can easily be folded > in and explained as "this is an alias for --format=%(objectname)", just > like --name-only would become "this is an alias for --format=%(path)". A quick patch to do it below, seems to work, passes all tests, but I don't know how much I'd trust it. It's also quite an add use of strbuf_expa(). We print to stdout directly since write_name_quoted_relative() really wants to write to stdout, and not give you a buffer. But I guess it makes sense in a way. The hardcoded %7s for %(objectsize) is a bit nasty, but I don't know if we've got anything existing that handles format specifiers with strbuf_expand() that we could steal. I really wouldn't trust this code much, I found it when writing it that our tests for ls-tree are really lacking, e.g. we may not have a single test for "-l" anywhere (or maybe I didn't look enough, I was just running t/*ls*tree* while hacking it. I do thin that we should consider just going with --format in either case if we agree that this is a good direction. I.e. could just support 3-4 hardcoded formats now and die if anything else is specified. Then we'd be future-proof with the same interface expanding later, and wouldn't need to support options that we're only carrying because we didn't implement the more generic format support. (Assume my Signed-off-by, if there's any interest...) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 3a442631c71..e89daad4229 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -31,6 +31,20 @@ static const char * const ls_tree_usage[] = { NULL }; +static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname) %(path)"; +static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize) %(path)"; +static const char *ls_tree_format_n = "%(path)"; + +struct expand_ls_tree_data { + const char *format; + unsigned mode; + const char *type; + const struct object_id *oid; + int abbrev; + const char *pathname; + const char *basebuf; +}; + static int show_recursive(const char *base, int baselen, const char *pathname) { int i; @@ -61,9 +75,69 @@ static int show_recursive(const char *base, int baselen, const char *pathname) return 0; } +static size_t expand_show_tree(struct strbuf *sb, + const char *start, + void *context) +{ + struct expand_ls_tree_data *data = context; + const char *end; + const char *p; + size_t len; + const char *type = blob_type; + + if (sb->len) { + fputs(sb->buf, stdout); + strbuf_reset(sb); + } + + if (*start != '(') + die(_("bad format as of '%s'"), start); + end = strchr(start + 1, ')'); + if (!end) + die(_("ls-tree format element '%s' does not end in ')'"), + start); + len = end - start + 1; + + if (skip_prefix(start, "(objectmode)", &p)) { + printf("%06o", data->mode); + } else if (skip_prefix(start, "(objecttype)", &p)) { + fputs(data->type, stdout); + } else if (skip_prefix(start, "(objectsize)", &p)) { + char size_text[24]; + const struct object_id *oid = data->oid; + + if (!strcmp(type, blob_type)) { + unsigned long size; + if (oid_object_info(the_repository, oid, &size) == OBJ_BAD) + xsnprintf(size_text, sizeof(size_text), + "BAD"); + else + xsnprintf(size_text, sizeof(size_text), + "%"PRIuMAX, (uintmax_t)size); + } else { + xsnprintf(size_text, sizeof(size_text), "-"); + } + printf("%7s", size_text); + } else if (skip_prefix(start, "(objectname)", &p)) { + fputs(find_unique_abbrev(data->oid, data->abbrev), stdout); + } else if (skip_prefix(start, "(path)", &p)) { + write_name_quoted_relative(data->basebuf, + chomp_prefix ? ls_tree_prefix : NULL, + stdout, line_termination); + + } else { + unsigned int errlen = (unsigned long)len; + die(_("bad ls-tree format specifiec %%%.*s"), errlen, start); + } + + return len; +} + static int show_tree(const struct object_id *oid, struct strbuf *base, const char *pathname, unsigned mode, void *context) { + struct expand_ls_tree_data *data = context; + struct strbuf sb = STRBUF_INIT; int retval = 0; int baselen; const char *type = blob_type; @@ -90,31 +164,18 @@ static int show_tree(const struct object_id *oid, struct strbuf *base, else if (ls_options & LS_TREE_ONLY) return 0; - if (!(ls_options & LS_NAME_ONLY)) { - if (ls_options & LS_SHOW_SIZE) { - char size_text[24]; - if (!strcmp(type, blob_type)) { - unsigned long size; - if (oid_object_info(the_repository, oid, &size) == OBJ_BAD) - xsnprintf(size_text, sizeof(size_text), - "BAD"); - else - xsnprintf(size_text, sizeof(size_text), - "%"PRIuMAX, (uintmax_t)size); - } else - xsnprintf(size_text, sizeof(size_text), "-"); - printf("%06o %s %s %7s\t", mode, type, - find_unique_abbrev(oid, abbrev), - size_text); - } else - printf("%06o %s %s\t", mode, type, - find_unique_abbrev(oid, abbrev)); - } baselen = base->len; strbuf_addstr(base, pathname); - write_name_quoted_relative(base->buf, - chomp_prefix ? ls_tree_prefix : NULL, - stdout, line_termination); + + strbuf_reset(&sb); + data->mode = mode; + data->type = type; + data->oid = oid; + data->abbrev = abbrev; + data->pathname = pathname; + data->basebuf = base->buf; + strbuf_expand(&sb, data->format, expand_show_tree, data); + strbuf_setlen(base, baselen); return retval; } @@ -147,6 +208,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) OPT__ABBREV(&abbrev), OPT_END() }; + struct expand_ls_tree_data ls_tree_cb_data = { + .format = ls_tree_format_d, + }; git_config(git_default_config, NULL); ls_tree_prefix = prefix; @@ -161,8 +225,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) } /* -d -r should imply -t, but -d by itself should not have to. */ if ( (LS_TREE_ONLY|LS_RECURSIVE) == - ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options)) + ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options)) { ls_options |= LS_SHOW_TREES; + } + if (ls_options & LS_NAME_ONLY) + ls_tree_cb_data.format = ls_tree_format_n; + + if (ls_options & LS_SHOW_SIZE) + ls_tree_cb_data.format = ls_tree_format_l; if (argc < 1) usage_with_options(ls_tree_usage, ls_tree_options); @@ -185,6 +255,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) tree = parse_tree_indirect(&oid); if (!tree) die("not a tree object"); + return !!read_tree(the_repository, tree, - &pathspec, show_tree, NULL); + &pathspec, show_tree, &ls_tree_cb_data); }
> A quick patch to do it below, seems to work, passes all tests, but I > don't know how much I'd trust it. It's also quite an add use of > strbuf_expa(). We print to stdout directly since > write_name_quoted_relative() really wants to write to stdout, and not > give you a buffer. But I guess it makes sense in a way. Thanks for the patch and the inputs about "strbuf_expa()". > Then we'd be future-proof with the same interface expanding later, and > wouldn't need to support options that we're only carrying because we > didn't implement the more generic format support. I agree but like Peff said it maybe another bigger task. I think I will firstly solve the existing problems in next patch. I will consider about the generic format support but not sure whether it will continue to iterate in this patchset. > (Assume my Signed-off-by, if there's any interest...) Of course I will. Thank you very much for your advice and guidance again.