From patchwork Thu Jan 19 18:18:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Oakley
X-Patchwork-Id: 13108497
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 68E3DC46467
 for ; Thu, 19 Jan 2023 18:20:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S229459AbjASSTq (ORCPT ); Thu, 19 Jan 2023 13:19:46 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45242 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229520AbjASSTl (ORCPT ); Thu, 19 Jan 2023 13:19:41 -0500
Received: from smtp-out-2.talktalk.net (smtp-out-2.talktalk.net [62.24.135.66])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BA3C86A8
 for ; Thu, 19 Jan 2023 10:19:38 -0800 (PST)
Received: from localhost.localdomain ([88.110.98.79])
 by smtp.talktalk.net with SMTP
 id IZUipjyTdLVi2IZUipC7TI; Thu, 19 Jan 2023 18:18:36 +0000
X-Originating-IP: [88.110.98.79]
X-Spam: 0
X-OAuthority: v=2.3 cv=H8GlPNQi c=1 sm=1 tr=0 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:117
 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:17 a=MKtGQD3n3ToA:10 a=1oJP67jkp3AA:10
 a=ldyaYNNxDcoA:10 a=7a1DmALN0UtMUJkjXpMA:9
From: Philip Oakley
To: GitList , Junio C Hamano
Cc: Taylor Blau , NSENGIYUMVA WILBERFORCE , self
Subject: [PATCH v5 1/5] doc: pretty-formats: separate parameters from placeholders
Date: Thu, 19 Jan 2023 18:18:23 +0000
Message-Id: <20230119181827.1319-2-philipoakley@iee.email>
X-Mailer: git-send-email 2.39.1.windows.1
In-Reply-To: <20230119181827.1319-1-philipoakley@iee.email>
References: <20221112143616.1429-1-philipoakley@iee.email>
 <20230119181827.1319-1-philipoakley@iee.email>
MIME-Version: 1.0
X-CMAE-Envelope:
 MS4wfGTp0OUqD3NGPy3/1ba5RpzvzF0pUJsTRnGtJrNXAOebb88jZKzL8tJoeIccMntvl6Pg6HkLx1s3kCowat7FEpQbN85CobO5YByUIQHQnRXntpxuAym2
 dk+wzCTKl+oE0DA7dylFuv7uovCJOkr+PMA7kIbVOxdCcuyxO2/+J8VGU700eqVpPDNGOI1f8En1ynj2+O/KawNK6YTsyg0tzGjmBgcD9AXen7pX3F8YGQy/
 5eEf8+yoJWOZQDxwMm+LWrZ5VEW3GWAzKIevySyYsPR4yMoTYWxDLlG+jm9M0suNGMpJFLV2k9BuSOp/TkokRg==
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Commit a57523428b4 (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced columnated placeholders. These placeholders
can be confusing as they contain `<` and `>` characters as part
of their placeholders adjacent to the `<N>` parameters.

Add spaces either side of the `<N>` parameters in the title line.
The code (strtol) will consume any spaces around the number values
(assuming they are passed as a quoted string with spaces).

Note that the spaces are optional.

Subsequent commits will clarify other confusions.

Signed-off-by: Philip Oakley
---
 Documentation/pretty-formats.txt | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 0b4c1c8d98..02bec23509 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -146,24 +146,27 @@ The placeholders are:
 '%m':: left (`<`), right (`>`) or boundary (`-`) mark
 '%w([<w>[,<i1>[,<i2>]]])':: switch line wrapping, like the -w option of
                             linkgit:git-shortlog[1].
-'%<(<N>[,trunc|ltrunc|mtrunc])':: make the next placeholder take at
+'%<( <N> [,trunc|ltrunc|mtrunc])':: make the next placeholder take at
                                   least N columns, padding spaces on
                                   the right if necessary.  Optionally
                                   truncate at the beginning (ltrunc),
                                   the middle (mtrunc) or the end
                                   (trunc) if the output is longer than
-                                  N columns.  Note that truncating
+                                  N columns.
+                                  Note 1: that truncating
                                   only works correctly with N >= 2.
-'%<|(<N>)':: make the next placeholder take at least until Nth
+                                  Note 2: spaces around the N
+                                  values are optional.
+'%<|( <N> )':: make the next placeholder take at least until Nth
              columns, padding spaces on the right if necessary
-'%>(<N>)', '%>|(<N>)':: similar to '%<(<N>)', '%<|(<N>)' respectively,
+'%>( <N> )', '%>|( <N> )':: similar to '%<( <N> )', '%<|( <N> )' respectively,
                         but padding spaces on the left
-'%>>(<N>)', '%>>|(<N>)':: similar to '%>(<N>)', '%>|(<N>)'
+'%>>( <N> )', '%>>|( <N> )':: similar to '%>( <N> )', '%>|( <N> )'
                           respectively, except that if the next
                           placeholder takes more spaces than given and
                           there are spaces on its left, use those
                           spaces
-'%><(<N>)', '%><|(<N>)':: similar to '%<(<N>)', '%<|(<N>)'
+'%><( <N> )', '%><|( <N> )':: similar to '%<( <N> )', '%<|( <N> )'
                           respectively, but padding both sides
                           (i.e. the text is centered)

From patchwork Thu Jan 19 18:18:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Oakley
X-Patchwork-Id: 13108498
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id D0FA5C6379F
 for ; Thu, 19 Jan 2023 18:20:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S230093AbjASSUB (ORCPT ); Thu, 19 Jan 2023 13:20:01 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45246 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229608AbjASSTl (ORCPT ); Thu, 19 Jan 2023 13:19:41 -0500
Received: from smtp-out-2.talktalk.net (smtp-out-2.talktalk.net [62.24.135.66])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99E292B29B
 for ; Thu, 19 Jan 2023 10:19:38 -0800 (PST)
Received: from localhost.localdomain ([88.110.98.79])
 by smtp.talktalk.net with SMTP
 id IZUipjyTdLVi2IZUipC7TN; Thu, 19 Jan 2023 18:18:37 +0000
X-Originating-IP: [88.110.98.79]
X-Spam: 0
X-OAuthority: v=2.3 cv=H8GlPNQi c=1 sm=1 tr=0 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:117
 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:17 a=MKtGQD3n3ToA:10 a=1oJP67jkp3AA:10
 a=ldyaYNNxDcoA:10 a=maobgtrrHKCnxS5SnIYA:9
From: Philip Oakley
To: GitList , Junio C Hamano
Cc: Taylor Blau , NSENGIYUMVA WILBERFORCE , self
Subject: [PATCH v5 2/5] doc: pretty-formats: delineate `%<|(` parameter values
Date: Thu, 19 Jan 2023 18:18:24 +0000
Message-Id: <20230119181827.1319-3-philipoakley@iee.email>
X-Mailer: git-send-email 2.39.1.windows.1
In-Reply-To: <20230119181827.1319-1-philipoakley@iee.email>
References: <20221112143616.1429-1-philipoakley@iee.email>
 <20230119181827.1319-1-philipoakley@iee.email>
MIME-Version: 1.0
X-CMAE-Envelope:
 MS4wfKjmWDDidILLRDqZI4JfqPxXhx1+6DJpE85VAVVwD/HbvHJ9KCP5dEd4rs3ON1QiHRtNnO0KWF7JDsrMj8EIKCFRV60EB5PXpVJk71yxiBwbkthTJaQ0
 vGbc5Jgpyg4zXpzDbBzic7zkMun3QwdKEHaHB3W1ysYb/+1BXRsGAIK6bkvSl+/W1uVZnuJnqwIh70TozmxCrNfLooxpBwAOHH0YY7piZJbc3ZFxf7aedb53
 Jpdfl8lR9nlGMvIoM3CThr9JnXA/7CKMPphE1Stg1BgVA8X26m74YEYEiBqTKIXL+dmhuswhYjHErc9UWgdkhw==
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Commit a57523428b4 (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced column width placeholders. It also added
separate column position `%<|(` placeholders for display screen based
placement.

Change the display screen parameter reference from 'N' to 'M' and
the corresponding descriptions to make the distinction clearer.

Signed-off-by: Philip Oakley
---
 Documentation/pretty-formats.txt | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 02bec23509..8cc1072196 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -147,7 +147,7 @@ The placeholders are:
 '%w([<w>[,<i1>[,<i2>]]])':: switch line wrapping, like the -w option of
                             linkgit:git-shortlog[1].
 '%<( <N> [,trunc|ltrunc|mtrunc])':: make the next placeholder take at
-                                  least N columns, padding spaces on
+                                  least N column widths, padding spaces on
                                   the right if necessary.  Optionally
                                   truncate at the beginning (ltrunc),
                                   the middle (mtrunc) or the end
@@ -155,18 +155,18 @@ The placeholders are:
                                   N columns.
                                   Note 1: that truncating
                                   only works correctly with N >= 2.
-                                  Note 2: spaces around the N
+                                  Note 2: spaces around the N and M (see below)
                                   values are optional.
-'%<|( <N> )':: make the next placeholder take at least until Nth
-             columns, padding spaces on the right if necessary
-'%>( <N> )', '%>|( <N> )':: similar to '%<( <N> )', '%<|( <N> )' respectively,
+'%<|( <M> )':: make the next placeholder take at least until Mth
+             display column, padding spaces on the right if necessary
+'%>( <N> )', '%>|( <M> )':: similar to '%<( <N> )', '%<|( <M> )' respectively,
                         but padding spaces on the left
-'%>>( <N> )', '%>>|( <N> )':: similar to '%>( <N> )', '%>|( <N> )'
+'%>>( <N> )', '%>>|( <M> )':: similar to '%>( <N> )', '%>|( <M> )'
                           respectively, except that if the next
                           placeholder takes more spaces than given and
                           there are spaces on its left, use those
                           spaces
-'%><( <N> )', '%><|( <N> )':: similar to '%<( <N> )', '%<|( <N> )'
+'%><( <N> )', '%><|( <M> )':: similar to '%<( <N> )', '%<|( <M> )'
                           respectively, but padding both sides
                           (i.e. the text is centered)

From patchwork Thu Jan 19 18:18:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Oakley
X-Patchwork-Id: 13108500
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id AF44CC678D4
 for ; Thu, 19 Jan 2023 18:20:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S230004AbjASSTw (ORCPT ); Thu, 19 Jan 2023 13:19:52 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45248 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229561AbjASSTl (ORCPT ); Thu, 19 Jan 2023 13:19:41 -0500
Received: from smtp-out-2.talktalk.net (smtp-out-2.talktalk.net [62.24.135.66])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 44EB94B8A8
 for ; Thu, 19 Jan 2023 10:19:39 -0800 (PST)
Received: from localhost.localdomain ([88.110.98.79])
 by smtp.talktalk.net with SMTP
 id IZUipjyTdLVi2IZUjpC7TZ; Thu, 19 Jan 2023 18:18:37 +0000
X-Originating-IP: [88.110.98.79]
X-Spam: 0
X-OAuthority: v=2.3 cv=H8GlPNQi c=1 sm=1 tr=0 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:117
 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:17 a=MKtGQD3n3ToA:10 a=1oJP67jkp3AA:10
 a=ldyaYNNxDcoA:10 a=UKw2cPBjIe2Ex6xPS1QA:9
From: Philip Oakley
To: GitList , Junio C Hamano
Cc: Taylor Blau , NSENGIYUMVA WILBERFORCE , self
Subject: [PATCH v5 4/5] doc: pretty-formats describe use of ellipsis in truncation
Date: Thu, 19 Jan 2023 18:18:26 +0000
Message-Id: <20230119181827.1319-5-philipoakley@iee.email>
X-Mailer: git-send-email 2.39.1.windows.1
In-Reply-To: <20230119181827.1319-1-philipoakley@iee.email>
References: <20221112143616.1429-1-philipoakley@iee.email>
 <20230119181827.1319-1-philipoakley@iee.email>
MIME-Version: 1.0
X-CMAE-Envelope:
 MS4wfKjmWDDidILLRDqZI4JfqPxXhx1+6DJpE85VAVVwD/HbvHJ9KCP5dEd4rs3ON1QiHRtNnO0KWF7JDsrMj8EIKCFRV60EB5PXpVJk71yxiBwbkthTJaQ0
 vGbc5Jgpyg4zXpzDbBzic7zkMun3QwdKEHaHB3W1ysYb/+1BXRsGAIK6bkvSl+/W1uVZnuJnqwIh70TozmxCrNfLooxpBwAOHH0YY7piZJbc3ZFxf7aedb53
 Jpdfl8lR9nlGMvIoM3CThr9JnXA/7CKMPphE1Stg1BgVA8X26m74YEYEiBqTKIXL+dmhuswhYjHErc9UWgdkhw==
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Commit a7f01c6b4d (pretty: support truncating in %>, %< and %><,
2013-04-19) added the use of ellipsis when truncating placeholder
values.

Show our 'two dot' ellipsis, and examples for the left, middle and
right truncation to avoid any confusion as to which end of the string
is adjusted. (cf justification and sub-string).

Signed-off-by: Philip Oakley
---
 Documentation/pretty-formats.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index cbca60a196..e51f1e54e1 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -149,9 +149,9 @@ The placeholders are:
 '%<( <N> [,trunc|ltrunc|mtrunc])':: make the next placeholder take at
                                   least N column widths, padding spaces on
                                   the right if necessary.  Optionally
-                                  truncate at the beginning (ltrunc),
-                                  the middle (mtrunc) or the end
-                                  (trunc) if the output is longer than
+                                  truncate (with ellipsis '..') at the left (ltrunc) `..ft`,
+                                  the middle (mtrunc) `mi..le`, or the end
+                                  (trunc) `rig..`, if the output is longer than
                                   N columns.
                                   Note 1: that truncating
                                   only works correctly with N >= 2.
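
As an illustration of the three truncation modes documented in the patch
above, here is a minimal Python sketch. It is not Git's implementation:
`pad_or_truncate` is a hypothetical helper, it counts one column per
character (plain ASCII only), and the head/tail split used for `mtrunc` is
an assumption inferred from the `a..ij` line expected by the test in patch
5/5.

```python
def pad_or_truncate(text: str, n: int, mode: str) -> str:
    """Model of '%<( <N> ,mode)': pad on the right, or truncate with '..'.

    Hypothetical helper for illustration; assumes one column per character.
    """
    if n < 2:
        # The documentation notes truncation only works correctly with N >= 2.
        raise ValueError("N must be at least 2")
    if len(text) <= n:
        return text.ljust(n)          # shorter values are padded with spaces
    keep = n - 2                      # characters left over beside the '..'
    if mode == "ltrunc":              # drop the left-hand end
        return ".." + text[len(text) - keep:]
    if mode == "mtrunc":              # drop the middle
        head = n // 2 - 1             # assumed split, see lead-in
        tail = keep - head
        return text[:head] + ".." + text[len(text) - tail:]
    if mode == "trunc":               # drop the right-hand end
        return text[:keep] + ".."
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("ltrunc", "mtrunc", "trunc"):
        print(mode, repr(pad_or_truncate("abcdefghij", 5, mode)))
```

Under these assumptions the ten-character subject `abcdefghij` with N = 5
keeps three characters in each mode, and only the position of the `..`
changes.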

From patchwork Thu Jan 19 18:18:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Oakley
X-Patchwork-Id: 13108499
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 3AC45C678D6
 for ; Thu, 19 Jan 2023 18:20:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S230123AbjASSUI (ORCPT ); Thu, 19 Jan 2023 13:20:08 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45250 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229609AbjASSTm (ORCPT ); Thu, 19 Jan 2023 13:19:42 -0500
Received: from smtp-out-2.talktalk.net (smtp-out-2.talktalk.net [62.24.135.66])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6448E5B99
 for ; Thu, 19 Jan 2023 10:19:40 -0800 (PST)
Received: from localhost.localdomain ([88.110.98.79])
 by smtp.talktalk.net with SMTP
 id IZUipjyTdLVi2IZUjpC7Tg; Thu, 19 Jan 2023 18:18:37 +0000
X-Originating-IP: [88.110.98.79]
X-Spam: 0
X-OAuthority: v=2.3 cv=H8GlPNQi c=1 sm=1 tr=0 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:117
 a=qs8Jj6vsB7NiZ+3IlNxB6Q==:17 a=MKtGQD3n3ToA:10 a=1oJP67jkp3AA:10
 a=ldyaYNNxDcoA:10 a=VwQbUJbxAAAA:8 a=hO_LHwqjR-8dI1bbRI8A:9
 a=vMvttVU5AH4A:10 a=AjGcO6oz07-iQ99wixmX:22
From: Philip Oakley
To: GitList , Junio C Hamano
Cc: Taylor Blau , NSENGIYUMVA WILBERFORCE , self
Subject: [PATCH v5 5/5] doc: pretty-formats note wide char limitations, and add tests
Date: Thu, 19 Jan 2023 18:18:27 +0000
Message-Id: <20230119181827.1319-6-philipoakley@iee.email>
X-Mailer: git-send-email 2.39.1.windows.1
In-Reply-To: <20230119181827.1319-1-philipoakley@iee.email>
References: <20221112143616.1429-1-philipoakley@iee.email>
 <20230119181827.1319-1-philipoakley@iee.email>
MIME-Version: 1.0
X-CMAE-Envelope:
 MS4wfKjmWDDidILLRDqZI4JfqPxXhx1+6DJpE85VAVVwD/HbvHJ9KCP5dEd4rs3ON1QiHRtNnO0KWF7JDsrMj8EIKCFRV60EB5PXpVJk71yxiBwbkthTJaQ0
 vGbc5Jgpyg4zXpzDbBzic7zkMun3QwdKEHaHB3W1ysYb/+1BXRsGAIK6bkvSl+/W1uVZnuJnqwIh70TozmxCrNfLooxpBwAOHH0YY7piZJbc3ZFxf7aedb53
 Jpdfl8lR9nlGMvIoM3CThr9JnXA/7CKMPphE1Stg1BgVA8X26m74YEYEiBqTKIXL+dmhuswhYjHErc9UWgdkhw==
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

The previous commits added clarifications to the column alignment
placeholders, noting that the spaces around the parameters are optional.

Also, a proposed extension [1] to allow hard truncation (without
ellipsis '..') highlighted that the existing code does not play well
with wide characters, such as East Asian characters and emojis.

For example, N wide characters take 2N columns, so they cannot exactly
fill an odd column width, causing misalignment somewhere.

Further analysis also showed that decomposed characters, e.g. separate
`a` + `umlaut` Unicode code-points, may also be mis-counted, in some
cases leaving multiple loose `umlauts` all combined together.

Add some notes about these limitations, and add basic tests to
demonstrate them.

The chosen solution for the tests is to replace any wide character
that overlaps a splitting boundary with the Unicode vertical ellipsis
code point, as a rare but 'obvious' substitution.

An alternative could be substitution with a single dot '.', which
matches regular expression usage and our two dot ellipsis, and which
would be obvious in scenarios where the bulk of the text is wide
characters. In mainly ASCII scenarios a singleton emoji being
substituted by a dot could be confusing.

It is enough that the tests fail cleanly. The final choice for the
substitute character can be deferred.

[1] https://lore.kernel.org/git/20221030185614.3842-1-philipoakley@iee.email/

Signed-off-by: Philip Oakley
---
 Documentation/pretty-formats.txt |  5 +++++
 t/t4205-log-pretty-formats.sh    | 27 +++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index e51f1e54e1..3b71334459 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -157,6 +157,11 @@ The placeholders are:
                                   only works correctly with N >= 2.
                                   Note 2: spaces around the N and M (see below)
                                   values are optional.
+                                  Note 3: Emojis and other wide characters
+                                  will take two display columns, which may
+                                  over-run column boundaries.
+                                  Note 4: decomposed character combining marks
+                                  may be misplaced at padding boundaries.
 '%<|( <M> )':: make the next placeholder take at least until Mth
              display column, padding spaces on the right if necessary.
              Use negative M values for column positions measured
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index 0404491d6e..2cba0e0c56 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -1018,4 +1018,31 @@ test_expect_success '%(describe:abbrev=...) vs git describe --abbrev=...' '
 	test_cmp expect actual
 '
 
+# pretty-formats note wide char limitations, and add tests
+test_expect_failure 'wide and decomposed characters column counting' '
+
+# from t/lib-unicode-nfc-nfd.sh hex values converted to octal
+	utf8_nfc=$(printf "\303\251") && # e acute combined.
+	utf8_nfd=$(printf "\145\314\201") && # e with a combining acute (i.e. decomposed)
+	utf8_emoji=$(printf "\360\237\221\250") &&
+
+# replacement character when requesting a wide char fits in a single display column.
+# "half wide" alternative could be a plain ASCII dot `.`
+	utf8_vert_ell=$(printf "\342\213\256") &&
+
+# use ${xxx} here!
+	nfc10="${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}" &&
+	nfd10="${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}" &&
+	emoji5="${utf8_emoji}${utf8_emoji}${utf8_emoji}${utf8_emoji}${utf8_emoji}" &&
+# emoji5 uses 10 display columns
+
+	test_commit "abcdefghij" &&
+	test_commit --no-tag "${nfc10}" &&
+	test_commit --no-tag "${nfd10}" &&
+	test_commit --no-tag "${emoji5}" &&
+	printf "${utf8_emoji}..${utf8_emoji}${utf8_vert_ell}\n${utf8_nfd}..${utf8_nfd}${utf8_nfd}\n${utf8_nfc}..${utf8_nfc}${utf8_nfc}\na..ij\n" >expected &&
+	git log --format="%<(5,mtrunc)%s" -4 >actual &&
+	test_cmp expected actual
+'
+
 test_done
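
The column-counting problem that the test above exercises can be seen
outside Git with a short Python sketch. This is only an approximation of
terminal behaviour, not Git's own width logic: `display_width` is a
hypothetical helper that assumes combining marks take no column and that
East Asian Wide or Fullwidth characters take two. The three sample strings
are the same code points the test builds with `printf`.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Hypothetical helper: combining marks count as 0 columns, East Asian
    Wide/Fullwidth characters as 2, everything else as 1.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue                  # e.g. U+0301 sits on the previous char
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2                # emojis and other wide characters
        else:
            width += 1
    return width


nfc = "\u00e9"         # precomposed e-acute   (UTF-8 octal \303\251)
nfd = "e\u0301"        # e + combining acute   (UTF-8 octal \145\314\201)
emoji = "\U0001F468"   # the emoji in the test (UTF-8 octal \360\237\221\250)

if __name__ == "__main__":
    for name, sample in (("nfc10", nfc * 10), ("nfd10", nfd * 10), ("emoji5", emoji * 5)):
        print(name, "code points:", len(sample), "columns:", display_width(sample))
```

All three samples occupy ten columns but contain 10, 20 and 5 code points
respectively, so a count based on code points alone would put the
truncation boundary in the wrong place for the decomposed and emoji
subjects.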