Message ID | 20190607225900.89299-1-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | revision: remove stray whitespace when name empty | expand |
On Fri, Jun 7, 2019 at 6:59 PM Emily Shaffer <emilyshaffer@google.com> wrote: > Teach show_object_with_name() to avoid writing a space before a name > which is empty. Also teach tests for rev-list --objects --filter to not > require a space between the object ID and name. > [...] > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > Not sure if making the whitespace optional is the right solution for the > test, although I couldn't come up with a cleaner approach. Maybe > something like this would be better, even though handling it in the > regex is shorter? > > if [[ -z "$name" ]] then In Git, we avoid Bash-isms. Just use 'test'. And, style is to place 'then' on its own line. if test -z "$name" then > diff --git a/revision.c b/revision.c > @@ -40,7 +40,8 @@ void show_object_with_name(FILE *out, struct object *obj, const char *name) > { > - fprintf(out, "%s ", oid_to_hex(&obj->oid)); > + fprintf(out, "%s%s", oid_to_hex(&obj->oid), > + strcmp(name, "") == 0 ? "" : " "); > for (p = name; *p && *p != '\n'; p++) > fputc(*p, out); > fputc('\n', out); It's subjective, but this starts to be a bit too noisy and unreadable. An alternative: fputs(oid_to_hex(...), out); if (*name) putc(' ', out); > diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh > @@ -288,7 +288,7 @@ expect_has () { > hash=$(git -C r3 rev-parse $commit:$name) && > - grep "^$hash $name$" actual > + grep "^$hash \?$name$" actual This would be the first use of \? with 'grep' in the test suite. I would worry about portability. Your suggestion earlier to tailor 'grep' invocation based upon whether $name is empty would be safer.
On Fri, Jun 07, 2019 at 03:59:00PM -0700, Emily Shaffer wrote: > Teach show_object_with_name() to avoid writing a space before a name > which is empty. Also teach tests for rev-list --objects --filter to not > require a space between the object ID and name. > [...] > --- > I don't see any reason _not_ to remove this stray whitespace at the end, > since it seems like it just gets in the way of easy scripting. I also > think this case will only present itself for root trees. I'm a bit worried that this might break existing scripts. As ugly as trailing whitespace is, it does tell you something here: that the object is a root tree and not a commit. So in the past I have done things like: git rev-list --objects --all | grep ' ' to get only the non-commits. I'm undecided on whether we're straying into https://xkcd.com/1172/ territory here. I'd be more in favor if this were making things significantly easier, but... > show_object_with_name() inserts a space between an object's OID and name > regardless of whether the name is empty or not. This causes 'git > cat-file (--batch | --batch-check)' to fail to discover the type of root > directories: > > git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ > | git cat-file --batch-check > git rev-parse HEAD: | xargs -I% git cat-file -t % > git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ > | xargs -I% echo "AA%AA" > git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ > | cut -f 1 -d ' ' | git cat-file --batch-check Your patch only helps with this at all because you're using the "tree:1" filter. It would not help: git rev-list --objects HEAD | git cat-file --batch-check because there you'll have actual names which cat-file will choke on. So it seems like this is helping only a very limited use case. cat-file actually does know how to split on whitespace. Unfortunately it does not do so by default, because that breaks some cases. See 97be04077f (cat-file: only split on whitespace when %(rest) is used, 2013-08-02). So you _can_ do: git rev-list --objects HEAD | git cat-file --batch-check='%(objectname) %(objecttype) %(rest)' But: 1. That puts the %(rest) bits in your output, which you may not want. 2. You have to actually specify the full format, so you might have to repeat batch-check's default format items. I think it would be reasonable for cat-file to have an option to split on whitespace (and if not given explicitly by the user, default to the presence of %(rest) as we do now). Alternatively, it would be reasonable to me to have an option for "rev-list --objects" to have an option to suppress the filename (and space) entirely. I think in the longer run along those lines that "--objects" should allow cat-file style pretty-print formats, which would eliminate the need to pipe to cat-file in the first place. That makes this parsing problem go away entirely, and it's way more efficient to boot (rev-list already knows the types!). -Peff
Jeff King <peff@peff.net> writes: > Your patch only helps with this at all because you're using the "tree:1" > ... > because there you'll have actual names which cat-file will choke on. So > it seems like this is helping only a very limited use case. > ... > Alternatively, it would be reasonable to me to have an option for > "rev-list --objects" to have an option to suppress the filename (and > space) entirely. Yup, I think that is a more reasonable short-term change compared to the patch being discussed, and ... > I think in the longer run along those lines that "--objects" should > allow cat-file style pretty-print formats, which would eliminate the > need to pipe to cat-file in the first place. That makes this parsing > problem go away entirely, and it's way more efficient to boot (rev-list > already knows the types!). ... of course this makes tons of sense. Thanks.
On Fri, Jun 07, 2019 at 08:42:45PM -0400, Eric Sunshine wrote: > On Fri, Jun 7, 2019 at 6:59 PM Emily Shaffer <emilyshaffer@google.com> wrote: > > Teach show_object_with_name() to avoid writing a space before a name > > which is empty. Also teach tests for rev-list --objects --filter to not > > require a space between the object ID and name. > > [...] > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > > --- > > Not sure if making the whitespace optional is the right solution for the > > test, although I couldn't come up with a cleaner approach. Maybe > > something like this would be better, even though handling it in the > > regex is shorter? > > > > if [[ -z "$name" ]] then > > In Git, we avoid Bash-isms. Just use 'test'. And, style is to place > 'then' on its own line. > > if test -z "$name" > then > > > diff --git a/revision.c b/revision.c > > @@ -40,7 +40,8 @@ void show_object_with_name(FILE *out, struct object *obj, const char *name) > > { > > - fprintf(out, "%s ", oid_to_hex(&obj->oid)); > > + fprintf(out, "%s%s", oid_to_hex(&obj->oid), > > + strcmp(name, "") == 0 ? "" : " "); > > for (p = name; *p && *p != '\n'; p++) > > fputc(*p, out); > > fputc('\n', out); > > It's subjective, but this starts to be a bit too noisy and unreadable. > An alternative: > > fputs(oid_to_hex(...), out); > if (*name) > putc(' ', out); > > > diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh > > @@ -288,7 +288,7 @@ expect_has () { > > hash=$(git -C r3 rev-parse $commit:$name) && > > - grep "^$hash $name$" actual > > + grep "^$hash \?$name$" actual > > This would be the first use of \? with 'grep' in the test suite. I > would worry about portability. Your suggestion earlier to tailor > 'grep' invocation based upon whether $name is empty would be safer. Thanks for the review, Eric. I'm going to try again from scratch with Peff and Junio's suggestion about adding an option to remove object names, and I'll try to keep your general style concerns in mind when I do so. - Emily
On Mon, Jun 10, 2019 at 09:29:14AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Your patch only helps with this at all because you're using the "tree:1" > > ... > > because there you'll have actual names which cat-file will choke on. So > > it seems like this is helping only a very limited use case. > > ... > > Alternatively, it would be reasonable to me to have an option for > > "rev-list --objects" to have an option to suppress the filename (and > > space) entirely. > > Yup, I think that is a more reasonable short-term change compared to > the patch being discussed, and ... I like this too. Starting work on it today. > > > I think in the longer run along those lines that "--objects" should > > allow cat-file style pretty-print formats, which would eliminate the > > need to pipe to cat-file in the first place. That makes this parsing > > problem go away entirely, and it's way more efficient to boot (rev-list > > already knows the types!). > > ... of course this makes tons of sense. Probably not going to start work on this now, but I will make a note to myself to have a look if I have some spare time soon, and keep an eye out for reviews in that area. > > Thanks. Thanks all for the feedback, I'll have a reroll soon. - Emily
On Wed, Jun 12, 2019 at 12:37:45PM -0700, Emily Shaffer wrote: > > > Alternatively, it would be reasonable to me to have an option for > > > "rev-list --objects" to have an option to suppress the filename (and > > > space) entirely. > > > > Yup, I think that is a more reasonable short-term change compared to > > the patch being discussed, and ... > > I like this too. Starting work on it today. Great! > > > I think in the longer run along those lines that "--objects" should > > > allow cat-file style pretty-print formats, which would eliminate the > > > need to pipe to cat-file in the first place. That makes this parsing > > > problem go away entirely, and it's way more efficient to boot (rev-list > > > already knows the types!). > > > > ... of course this makes tons of sense. > > Probably not going to start work on this now, but I will make a note to > myself to have a look if I have some spare time soon, and keep an eye > out for reviews in that area. That's fine. I suspect it will be quite involved, because the format placeholders you'd want for "objects" are going to be more like "cat-file" ones than like the existing pretty.c ones. So I think this really implies unifying those format systems, which is a long-term project that Olga has been working on. -Peff
diff --git a/revision.c b/revision.c index d4aaf0ef25..260258b371 100644 --- a/revision.c +++ b/revision.c @@ -40,7 +40,8 @@ void show_object_with_name(FILE *out, struct object *obj, const char *name) { const char *p; - fprintf(out, "%s ", oid_to_hex(&obj->oid)); + fprintf(out, "%s%s", oid_to_hex(&obj->oid), + strcmp(name, "") == 0 ? "" : " "); for (p = name; *p && *p != '\n'; p++) fputc(*p, out); fputc('\n', out); diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index acd7f5ab80..e05faa7a67 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -288,7 +288,7 @@ expect_has () { name=$2 && hash=$(git -C r3 rev-parse $commit:$name) && - grep "^$hash $name$" actual + grep "^$hash \?$name$" actual } test_expect_success 'verify tree:1 includes root trees' '
Teach show_object_with_name() to avoid writing a space before a name which is empty. Also teach tests for rev-list --objects --filter to not require a space between the object ID and name. show_object_with_name() inserts a space between an object's OID and name regardless of whether the name is empty or not. This causes 'git cat-file (--batch | --batch-check)' to fail to discover the type of root directories: git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ | git cat-file --batch-check git rev-parse HEAD: | xargs -I% git cat-file -t % git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ | xargs -I% echo "AA%AA" git rev-list --objects --filter=tree:1 --max-count=1 HEAD \ | cut -f 1 -d ' ' | git cat-file --batch-check The extra space provided by 'show_object_with_name()' sticks to the output of 'git rev-list' even if it's piped into a file. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- I don't see any reason _not_ to remove this stray whitespace at the end, since it seems like it just gets in the way of easy scripting. I also think this case will only present itself for root trees. Not sure if making the whitespace optional is the right solution for the test, although I couldn't come up with a cleaner approach. Maybe something like this would be better, even though handling it in the regex is shorter? if [[ -z "$name" ]] then grep "^$hash" actual else grep "^$hash $name" actual fi - Emily revision.c | 3 ++- t/t6112-rev-list-filters-objects.sh | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)