Message ID | 9b3f6960-ea75-c3a7-3a24-0554320bb359@web.de (mailing list archive) |
---|---|
State | New, archived |
Series | column: use utf8_strnwidth() to strip out ANSI color escapes |
Hi René,

On Sun, 13 Oct 2019, René Scharfe wrote:

> Make use of utf8_strnwidth()'s feature to skip ANSI escape sequences
> instead of open-coding it.  This shortens the code and makes it more
> consistent.

Sounds good.

> This changes the behavior, though: The old code skips all kinds of
> Control Sequence Introducer sequences, while utf8_strnwidth() only skips
> the Select Graphic Rendition kind, i.e. those ending with "m".  They are
> used for specifying color and font attributes like boldness.  The only
> other kind of escape sequence we print in Git is Erase in Line, ending
> with "K".  That's not used for columnar output, so this difference
> actually doesn't matter here.

Arguably, the "Erase in Line" thing should re-set the width to 0, no?
But as you say, this is not needed for this patch.

Thanks,
Dscho

>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  column.c | 13 +------------
>  1 file changed, 1 insertion(+), 12 deletions(-)
>
> diff --git a/column.c b/column.c
> index 7a17c14b82..4a38eed322 100644
> --- a/column.c
> +++ b/column.c
> @@ -23,18 +23,7 @@ struct column_data {
>  /* return length of 's' in letters, ANSI escapes stripped */
>  static int item_length(const char *s)
>  {
> -	int len, i = 0;
> -	struct strbuf str = STRBUF_INIT;
> -
> -	strbuf_addstr(&str, s);
> -	while ((s = strstr(str.buf + i, "\033[")) != NULL) {
> -		int len = strspn(s + 2, "0123456789;");
> -		i = s - str.buf;
> -		strbuf_remove(&str, i, len + 3); /* \033[<len><func char> */
> -	}
> -	len = utf8_strwidth(str.buf);
> -	strbuf_release(&str);
> -	return len;
> +	return utf8_strnwidth(s, -1, 1);
>  }
>
>  /*
> --
> 2.23.0
>
On 14.10.19 at 13:13, Johannes Schindelin wrote:

> Hi René,
>
> On Sun, 13 Oct 2019, René Scharfe wrote:
>
>> This changes the behavior, though: The old code skips all kinds of
>> Control Sequence Introducer sequences, while utf8_strnwidth() only skips
>> the Select Graphic Rendition kind, i.e. those ending with "m".  They are
>> used for specifying color and font attributes like boldness.  The only
>> other kind of escape sequence we print in Git is Erase in Line, ending
>> with "K".  That's not used for columnar output, so this difference
>> actually doesn't matter here.
>
> Arguably, the "Erase in Line" thing should re-set the width to 0, no?
> But as you say, this is not needed for this patch.

It doesn't move the cursor, just clears the characters to the right, to
the left or both sides, depending on its parameter.  So ignoring it for
width calculation like the old code did would be appropriate -- if we'd
encounter such an escape sequence in text to be shown in columns.

René
Hi René,

On Mon, 14 Oct 2019, René Scharfe wrote:

> On 14.10.19 at 13:13, Johannes Schindelin wrote:
> >
> > On Sun, 13 Oct 2019, René Scharfe wrote:
> >
> >> This changes the behavior, though: The old code skips all kinds of
> >> Control Sequence Introducer sequences, while utf8_strnwidth() only skips
> >> the Select Graphic Rendition kind, i.e. those ending with "m".  They are
> >> used for specifying color and font attributes like boldness.  The only
> >> other kind of escape sequence we print in Git is Erase in Line, ending
> >> with "K".  That's not used for columnar output, so this difference
> >> actually doesn't matter here.
> >
> > Arguably, the "Erase in Line" thing should re-set the width to 0, no?
> > But as you say, this is not needed for this patch.
>
> It doesn't move the cursor, just clears the characters to the right, to
> the left or both sides, depending on its parameter.  So ignoring it for
> width calculation like the old code did would be appropriate -- if we'd
> encounter such an escape sequence in text to be shown in columns.

Whoops, you're right. I brainfarted, mistaking it for `\r`... My bad!

Ciao,
Dscho
diff --git a/column.c b/column.c
index 7a17c14b82..4a38eed322 100644
--- a/column.c
+++ b/column.c
@@ -23,18 +23,7 @@ struct column_data {
 /* return length of 's' in letters, ANSI escapes stripped */
 static int item_length(const char *s)
 {
-	int len, i = 0;
-	struct strbuf str = STRBUF_INIT;
-
-	strbuf_addstr(&str, s);
-	while ((s = strstr(str.buf + i, "\033[")) != NULL) {
-		int len = strspn(s + 2, "0123456789;");
-		i = s - str.buf;
-		strbuf_remove(&str, i, len + 3); /* \033[<len><func char> */
-	}
-	len = utf8_strwidth(str.buf);
-	strbuf_release(&str);
-	return len;
+	return utf8_strnwidth(s, -1, 1);
 }

 /*
Make use of utf8_strnwidth()'s feature to skip ANSI escape sequences
instead of open-coding it.  This shortens the code and makes it more
consistent.

This changes the behavior, though: The old code skips all kinds of
Control Sequence Introducer sequences, while utf8_strnwidth() only skips
the Select Graphic Rendition kind, i.e. those ending with "m".  They are
used for specifying color and font attributes like boldness.  The only
other kind of escape sequence we print in Git is Erase in Line, ending
with "K".  That's not used for columnar output, so this difference
actually doesn't matter here.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 column.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

--
2.23.0
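
The behavior difference described in the commit message can be illustrated with a small standalone sketch. This is not Git's code: the helpers `esc_sequence_len()` and `visible_length()` are hypothetical names made up for this example, and the sketch simply counts every remaining byte as one column instead of doing a real UTF-8 width calculation. With `sgr_only` set to 0 it skips any sequence of the form `ESC [ <digits and ';'> <final byte>`, roughly like the removed loop; with `sgr_only` set to 1 it skips only sequences ending in "m".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): return the length of an escape
 * sequence of the form ESC [ <digits and ';'> <final byte> starting at
 * 's', or 0 if there is none.  If 'sgr_only' is nonzero, only sequences
 * whose final byte is 'm' (Select Graphic Rendition) are recognized.
 */
static size_t esc_sequence_len(const char *s, int sgr_only)
{
	size_t n;

	if (s[0] != '\033' || s[1] != '[')
		return 0;
	n = 2 + strspn(s + 2, "0123456789;");
	if (s[n] == '\0')
		return 0; /* truncated sequence: treat as ordinary bytes */
	if (sgr_only && s[n] != 'm')
		return 0;
	return n + 1;
}

/*
 * Hypothetical stand-in for a width function: count the bytes of 's'
 * that are not part of a skipped escape sequence.  Assumes ASCII text,
 * so one byte is taken to be one column.
 */
static int visible_length(const char *s, int sgr_only)
{
	int width = 0;

	while (*s) {
		size_t skip = esc_sequence_len(s, sgr_only);

		if (skip) {
			s += skip;
			continue;
		}
		width++;
		s++;
	}
	return width;
}
```

For a colored item such as `"\033[1;31mred\033[m"` both modes give 3. For `"ab\033[Kcd"` the first mode gives 4, while the "m"-only mode also counts the three bytes of the Erase in Line sequence and gives 7. That only matters if such a sequence shows up in text laid out in columns, which the commit message says does not happen.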