From patchwork Sun Aug 14 14:19:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?0L3QsNCx?=
X-Patchwork-Id: 12942903
X-Patchwork-Delegate: herbert@gondor.apana.org.au
Date: Sun, 14 Aug 2022 16:19:02 +0200
From: =?utf-8?b?0L3QsNCx?=
To: dash@vger.kernel.org, Harald van Dijk
Subject: [PATCH v2 1/2] man: printf: reword to avoid confusion v/v Ar argument[s]/arguments
Message-ID: <0d82c5a6b3a84ec3c6f5eb3513e4800f2f1f601a.1660486650.git.nabijaczleweli@nabijaczleweli.xyz>
User-Agent: NeoMutt/20220429
Precedence: bulk
X-Mailing-List: dash@vger.kernel.org

The current wording says that given
  printf a b c d
a is the format, c and d are processed as noted, but b is unspecified

Signed-off-by: Ahelenia Ziemiańska
---
That's a fair cop, considering the manual already abuses Ar arguments as
Ar argument and arguments.

Please consider the following updated patch, which calls the arguments
consumed by conversions "value"s.

 src/dash.1 | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/src/dash.1 b/src/dash.1
index ff02237..310f34e 100644
--- a/src/dash.1
+++ b/src/dash.1
@@ -1528,30 +1528,26 @@ With the option specified the output will be formatted suitably for
 non-interactive use.
 .\".Pp
 .It Xo printf Ar format
-.Op Ar arguments ...
+.Oo Ar value Oc Ns ...
 .Xc
 .Ic printf
-formats and prints its arguments, after the first, under control
-of the
-.Ar format .
-The
-.Ar format
-is a character string which contains three types of objects: plain characters,
+formats and prints its arguments according to
+.Ar format ,
+a character string which contains three types of objects: plain characters,
 which are simply copied to standard output, character escape sequences which
 are converted and copied to the standard output, and format specifications,
 each of which causes printing of the next successive
-.Ar argument .
+.Ar value .
 .Pp
-The
-.Ar arguments
-after the first are treated as strings if the corresponding format is
+Each
+.Ar value
+is treated as a string if the corresponding format specification is
 either
 .Cm b ,
-.Cm c
+.Cm c ,
 or
 .Cm s ;
-otherwise it is evaluated as a C constant, with the following extensions:
-.Pp
+otherwise it is evaluated as a C constant, with the following additions:
 .Bl -bullet -offset indent -compact
 .It
 A leading plus or minus sign is allowed.
@@ -1561,8 +1557,9 @@ If the leading character is a single or double quote, the value is the
 code of the next character.
 .El
 .Pp
-The format string is reused as often as necessary to satisfy the
-.Ar arguments .
+The format string is reused as often as necessary until all
+.Ar value Ns s
+are consumed.
 Any extra format specifications are evaluated with zero or the null string.
 .Pp

From patchwork Sun Aug 14 14:19:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?0L3QsNCx?=
X-Patchwork-Id: 12942904
X-Patchwork-Delegate: herbert@gondor.apana.org.au
Date: Sun, 14 Aug 2022 16:19:07 +0200
From: =?utf-8?b?0L3QsNCx?=
To: dash@vger.kernel.org, Harald van Dijk
Subject: [PATCH v2 2/2] man: printf: in 'X, X is a byte under dash
User-Agent: NeoMutt/20220429
Precedence: bulk
X-Mailing-List: dash@vger.kernel.org

Multiple issues:
  * the encoding is not always ASCII
  * what ASCII code is assigned to я
  * dash isn't internationalised (this is nonconformant but out of scope),
    and uses the next /byte/; in a UTF-8 locale:
      $ printf %d\\n \'ą
      196
      $ printf %d\\n \'я
      209
this is in contrast to POSIX (and bash), which says:
> If the leading character is a single-quote or double-quote,
> the value shall be the numeric value in the underlying codeset
> of the character following the single-quote or double-quote.
(i.e. mbrtowc(&val, argv[n], ...))

Signed-off-by: Ahelenia Ziemiańska
---
 src/dash.1 | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/dash.1 b/src/dash.1
index 310f34e..38cf020 100644
--- a/src/dash.1
+++ b/src/dash.1
@@ -1552,9 +1552,7 @@ otherwise it is evaluated as a C constant, with the following additions:
 .It
 A leading plus or minus sign is allowed.
 .It
-If the leading character is a single or double quote, the value is the
-.Tn ASCII
-code of the next character.
+If the leading character is a single or double quote, the value is that of the next byte.
 .El
 .Pp
 The format string is reused as often as necessary until all
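
For reference, the behaviour that the reworded text in patch 1/2 describes (format reuse until every value is consumed, and zero or the null string for conversions left without a value) can be seen with a small POSIX sh sketch. The values used here are arbitrary illustrations, not taken from the manual:

```shell
# The format '%s=%d\n' consumes two values per pass; with five values it is
# reused three times, and the final %d has no value left, so it prints 0.
printf '%s=%d\n' a 1 b 2 c

# A numeric value may carry a leading plus sign.
printf '%d\n' +5

# With no values at all, the format is still evaluated once:
# %s gets the null string and %d gets zero.
printf '[%s][%d]\n'
```

The first command prints `a=1`, `b=2` and `c=0` on separate lines, which is the "b is unspecified" case from the commit message made explicit: every argument after the format is a value.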
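
The byte values quoted in the commit message of patch 2/2 can be checked independently of any particular shell's printf by dumping the first byte of each character with od. This is only a sketch: `first_byte` is a hypothetical helper, not part of dash, and the results assume the script itself is stored as UTF-8:

```shell
# Hypothetical helper (not part of dash): print the decimal value of the
# first byte of "$1", which is what a byte-oriented 'X conversion would see.
first_byte() {
    printf %s "$1" | od -An -tu1 -N1 | awk '{ print $1 }'
}

first_byte 'A'    # 65: for ASCII the byte and the character code coincide
first_byte 'ą'    # 196: U+0105 is encoded as the bytes C4 85
first_byte 'я'    # 209: U+044F is encoded as the bytes D1 8F

# For an ASCII character the quote form gives the same answer in any
# POSIX printf, byte-oriented or not.
printf '%d\n' "'A"
```

A printf that follows the POSIX wording in a UTF-8 locale would instead report the code points, 261 and 1103, for the two non-ASCII characters.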