diff mbox series

[v3,4/4] config.txt: describe handling of whitespace further

Message ID e389acbfacd5046a926b87346d41f9c7962e3c23.1710800549.git.dsimic@manjaro.org
State Superseded
Series Fix a bug in configuration parsing, and improve tests and documentation

Commit Message

Dragan Simic March 18, 2024, 10:24 p.m. UTC
Clarify which characters count as whitespace in the context of git
configuration files, and improve the description of trailing whitespace
handling, especially how it interacts with the presence of inline
comments.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---

Notes:
    Changes in v3:
        - Patch description was expanded a bit, to make it more on point
        - No changes to the documentation were introduced
    
    Changes in v2:
        - No changes were introduced

 Documentation/config.txt | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

Comments

Junio C Hamano March 20, 2024, 7:12 a.m. UTC | #1
Dragan Simic <dsimic@manjaro.org> writes:

>  A line that defines a value can be continued to the next line by
> +ending it with a `\`; the backslash and the end-of-line are stripped.
> +Leading whitespace characters after 'name =', the remainder of the
>  line after the first comment character '#' or ';', and trailing
> +whitespace characters of the line are discarded unless they are enclosed
> +in double quotes.

Can we directly tighten the "trailing..." part, instead of having to
add an extra long sentence ...

> +The discarding of the trailing whitespace characters
> +applies regardless of the discarding of the portion of the line after
> +the first comment character.

... like this as an attempt to clarify?

    Leading whitespace characters before and after 'name =', and the
    remainder of the line after the first comment character '#' or
    ';', are removed, and then trailing whitespace characters at the
    end of the line are discarded.

By the way, if a run of whitespace characters is enclosed in double
quotes, they cannot be trailing at the end of the line, as the
closing double quote is not a whitespace character, so it is out of
place to talk about quoted string in the context of trailing blank
removal.  The unquoting would want to be discussed separately.

> +Internal whitespace characters within the
> +value are retained verbatim.

Good.

>  
>  Inside double quotes, double quote `"` and backslash `\` characters
>  must be escaped: use `\"` for `"` and `\\` for `\`.

Thanks for working on this topic.

Dragan Simic March 20, 2024, 7:23 a.m. UTC | #2
On 2024-03-20 08:12, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>>  A line that defines a value can be continued to the next line by
>> +ending it with a `\`; the backslash and the end-of-line are stripped.
>> +Leading whitespace characters after 'name =', the remainder of the
>>  line after the first comment character '#' or ';', and trailing
>> +whitespace characters of the line are discarded unless they are enclosed
>> +in double quotes.
> 
> Can we directly tighten the "trailing..." part, instead of having to
> add an extra long sentence ...

Makes sense, to make it less convoluted.

>> +The discarding of the trailing whitespace characters
>> +applies regardless of the discarding of the portion of the line after
>> +the first comment character.
> 
> ... like this as an attempt to clarify?
> 
>     Leading whitespace characters before and after 'name =', and the

Hmm, "leading whitespace" and "after" don't go very well together.
Such a construct seems a bit confusing, because it implies there's
something else after, which the leading whitespace refers to, which
may or may not be easily understandable to the users.

I'll think about how to rephrase this a bit better.

>     remainder of the line after the first comment character '#' or
>     ';', are removed, and then trailing whitespace characters at the
>     end of the line are discarded.
> 
> By the way, if a run of whitespace characters is enclosed in double
> quotes, they cannot be trailing at the end of the line, as the
> closing double quote is not a whitespace character, so it is out of
> place to talk about quoted string in the context of trailing blank
> removal.  The unquoting would want to be discussed separately.

I'll think about this as well.

>>  Inside double quotes, double quote `"` and backslash `\` characters
>>  must be escaped: use `\"` for `"` and `\\` for `\`.
> 
> Thanks for working on this topic.

Thank you for your highly detailed reviews!

Junio C Hamano March 20, 2024, 2:42 p.m. UTC | #3
Dragan Simic <dsimic@manjaro.org> writes:

>>     Leading whitespace characters before and after 'name =', and the
> Hmm, "leading whitespace" and "after" don't go very well together.

True.  We can drop "leading" of course.  I meant to refer to, in
this sample illustration,

[section]
	var = value # comment

the fact that "\t" before "var =" is discarded, and " " after "var ="
before "value" is also discarded.

Thanks.

Dragan Simic March 20, 2024, 4:17 p.m. UTC | #4
On 2024-03-20 15:42, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>>>     Leading whitespace characters before and after 'name =', and the
>> Hmm, "leading whitespace" and "after" don't go very well together.
> 
> True.  We can drop "leading" of course.

Good point.

> I meant to refer to, in this sample illustration,
> 
> [section]
> 	var = value # comment
> 
> the fact that "\t" before "var =" is discarded, and " " after "var ="
> before "value" is also discarded.

Sure, thanks for the clarification.  I'll try to tweak the wording
a bit further.
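
For readers following the thread, the behaviour discussed for the sample
line above can be sketched roughly as follows.  This is only an
illustration in Python, not git's actual C parser; the function name
`parse_unquoted_line` and the two constants are hypothetical, and the
sketch assumes a line that contains '=' and uses no double quotes or
line continuations.

```python
WHITESPACE = " \t"     # SP and HT, the whitespace characters named in the patch
COMMENT_CHARS = "#;"   # either character starts an inline comment


def parse_unquoted_line(line):
    """Split a simple 'name = value' line (hypothetical helper).

    Assumes the line contains '=' and no double quotes or continuations.
    """
    name, _, rest = line.partition("=")
    # Whitespace before and after the name is discarded.
    name = name.strip(WHITESPACE)

    # Drop the remainder of the line after the first comment character.
    cut = len(rest)
    for ch in COMMENT_CHARS:
        idx = rest.find(ch)
        if idx != -1:
            cut = min(cut, idx)

    # Whitespace after 'name =' is dropped, and trailing whitespace is
    # dropped after the comment has been removed.
    value = rest[:cut].strip(WHITESPACE)
    return name, value


print(parse_unquoted_line("\tvar = value # comment"))
```

With the sample line, both the "\t" before "var =" and the " " between
"var =" and "value" are dropped, as is the space left behind once the
comment is removed.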

Patch

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 782c2bab906c..20f3300dc706 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -22,9 +22,10 @@  multivalued.
 Syntax
 ~~~~~~
 
-The syntax is fairly flexible and permissive; whitespaces are mostly
-ignored.  The '#' and ';' characters begin comments to the end of line,
-blank lines are ignored.
+The syntax is fairly flexible and permissive.  Whitespace characters,
+which in this context are the space character (SP) and the horizontal
+tabulation (HT), are mostly ignored.  The '#' and ';' characters begin
+comments to the end of line.  Blank lines are ignored.
 
 The file consists of sections and variables.  A section begins with
 the name of the section in square brackets and continues until the next
@@ -64,12 +65,14 @@  The variable names are case-insensitive, allow only alphanumeric characters
 and `-`, and must start with an alphabetic character.
 
 A line that defines a value can be continued to the next line by
-ending it with a `\`; the backslash and the end-of-line are
-stripped.  Leading whitespaces after 'name =', the remainder of the
+ending it with a `\`; the backslash and the end-of-line are stripped.
+Leading whitespace characters after 'name =', the remainder of the
 line after the first comment character '#' or ';', and trailing
-whitespaces of the line are discarded unless they are enclosed in
-double quotes.  Internal whitespaces within the value are retained
-verbatim.
+whitespace characters of the line are discarded unless they are enclosed
+in double quotes.  The discarding of the trailing whitespace characters
+applies regardless of the discarding of the portion of the line after
+the first comment character.  Internal whitespace characters within the
+value are retained verbatim.
 
 Inside double quotes, double quote `"` and backslash `\` characters
 must be escaped: use `\"` for `"` and `\\` for `\`.
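
The rules documented in the hunk above (comment characters, trailing
whitespace, double quotes, and the `\"` and `\\` escapes) can be
illustrated with a small sketch.  This is a simplified Python
approximation written for this discussion, not git's implementation;
`parse_value` is a hypothetical name, and the sketch assumes the text
after 'name =' has already been isolated, ignores line continuations,
and does not handle or reject any escapes other than `\"` and `\\`.

```python
def parse_value(raw):
    """Approximate the documented value rules (hypothetical helper)."""
    out = []
    keep = 0           # number of characters that survive trailing trimming
    in_quotes = False
    started = False    # becomes True once the value has begun
    i = 0
    while i < len(raw):
        ch = raw[i]
        # `\"` and `\\` stand for a literal quote and a literal backslash.
        if ch == "\\" and i + 1 < len(raw) and raw[i + 1] in '"\\':
            out.append(raw[i + 1])
            keep = len(out)
            started = True
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            started = True
            i += 1
            continue
        # Outside quotes, '#' or ';' ends the value.
        if not in_quotes and ch in "#;":
            break
        # Outside quotes, whitespace is kept only provisionally: leading
        # whitespace is skipped, internal whitespace survives because a
        # later character advances `keep`, trailing whitespace does not.
        if not in_quotes and ch in " \t":
            if started:
                out.append(ch)
            i += 1
            continue
        out.append(ch)
        keep = len(out)
        started = True
        i += 1
    return "".join(out[:keep])


print(repr(parse_value(' value # comment')))
print(repr(parse_value('"  padded  " ; note')))
```

The second call shows the point raised in the review: whitespace inside
the double quotes is never "trailing", because the closing quote follows
it, so it is retained while the space before ';' is dropped.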