[GSoC,1/2] userdiff: add builtin driver for gitconfig syntax

Message ID 20250319172016.2115-2-lucasseikioshiro@gmail.com (mailing list archive)
State New
Series add userdiff driver for gitconfig

Commit Message

Lucas Seiki Oshiro March 19, 2025, 5:20 p.m. UTC
From Documentation/config.adoc:

"""
The file consists of sections and variables. A section begins with
the name of the section in square brackets and continues until the next
section begins. Section names are case-insensitive. Only alphanumeric
characters, `-` and `.` are allowed in section names. Each variable
must belong to some section, which means that there must be a section
header before the first setting of a variable.

[...]

Subsection names are case sensitive and can contain any characters except
newline and the null byte.

The variable names are case-insensitive, allow only alphanumeric characters
and `-`, and must start with an alphabetic character.
"""

Then, add a new builtin driver for gitconfig files, where:

- the funcname regular expression matches sections and subsections,
  i.e. the pattern [SECTION] or [SECTION "SUBSECTION"], where the
  section name is composed of alphanumeric characters, `-` and `.`, and
  subsection names may be composed of any characters;

- word_regex is more permissive, matching any word with one or more
  non-whitespace characters.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
---
 userdiff.c | 4 ++++
 1 file changed, 4 insertions(+)
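
As a rough illustration of the funcname rule described in the commit message above, the following C sketch checks lines against a pattern written from that description, using POSIX extended regular expressions. The names `section_header_re` and `is_section_header` are hypothetical, and the pattern is an assumption derived from the prose, not a copy of the one in userdiff.c.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical pattern written from the commit message's description:
 * [SECTION] or [SECTION "SUBSECTION"], where SECTION uses only
 * alphanumerics, '-' and '.'. It is an assumption for illustration,
 * not the pattern from userdiff.c.
 */
static const char section_header_re[] =
	"^\\[[a-zA-Z0-9.-]+([ \t]+\".+\")?\\]$";

/* Return 1 if line looks like a section header, 0 if not, -1 on error. */
int is_section_header(const char *line)
{
	regex_t re;
	int matched;

	if (regcomp(&re, section_header_re, REG_EXTENDED | REG_NOSUB))
		return -1;
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

Under these assumptions, `is_section_header("[remote \"origin\"]")` returns 1, while a variable line such as `"\turl = x"` returns 0, so only section headers would be candidates for the hunk header.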

Comments

Patrick Steinhardt March 20, 2025, 8:38 a.m. UTC | #1
On Wed, Mar 19, 2025 at 02:20:15PM -0300, Lucas Seiki Oshiro wrote:
> From Documentation/config.adoc:
> 
> """
> The file consists of sections and variables. A section begins with
> the name of the section in square brackets and continues until the next
> section begins. Section names are case-insensitive. Only alphanumeric
> characters, `-` and `.` are allowed in section names. Each variable
> must belong to some section, which means that there must be a section
> header before the first setting of a variable.
> 
> [...]
> 
> Subsection names are case sensitive and can contain any characters except
> newline and the null byte.
> 
> The variable names are case-insensitive, allow only alphanumeric characters
> and `-`, and must start with an alphabetic character.
> """

I don't think it's necessary to quote this whole paragraph here, as most
of us should be quite familiar with its format. I'd rather summarize the
info a bit and explain how we can use the userdiff patterns for the
general structure of the config. And in case there are any subtleties in
the format it may make sense to specifically point out those instead of
quoting the whole manual.

> Then, add a new builtin driver for gitconfig files, where:
> 
> - the funcname regular expression matches sections and subsections,
>   i. e. the pattern [SECTION] or [SECTION "SUBSECTION"], where the
>   section is composed by alphanumeric numbers, `-` and `.`, and
>   subsection names may be composed by any characters;

Okay, makes sense.

> - word_regex is more permissive, matching any word with one or more
>   non-whitespace characters.

It would be nice to provide context _why_ it is more permissive and what
the effect is.

The order of the commit message in our project is typically a bit
different from what you have here: we first explain the actual problem
that we aim to solve before discussing how we solve it.

The code change itself looks sensible to me.

Patrick

D. Ben Knoble March 21, 2025, 2:11 a.m. UTC | #2
On Thu, Mar 20, 2025 at 4:41 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Wed, Mar 19, 2025 at 02:20:15PM -0300, Lucas Seiki Oshiro wrote:
> > From Documentation/config.adoc:
> >
> > """
> > The file consists of sections and variables. A section begins with
> > the name of the section in square brackets and continues until the next
> > section begins. Section names are case-insensitive. Only alphanumeric
> > characters, `-` and `.` are allowed in section names. Each variable
> > must belong to some section, which means that there must be a section
> > header before the first setting of a variable.
> >
> > [...]
> >
> > Subsection names are case sensitive and can contain any characters except
> > newline and the null byte.
> >
> > The variable names are case-insensitive, allow only alphanumeric characters
> > and `-`, and must start with an alphabetic character.
> > """
>
> I don't think it's necessary to quote this whole paragraph here, as most
> of us should be quite familiar with its format. I'd rather summarize the
> info a bit and explain how we can use the userdiff patterns for the
> general structure of the config. And in case there are any subtleties in
> the format it may make sense to specifically point out those instead of
> quoting the whole manual.

And, if we really felt it important to direct readers to the full
text, we could instruct them to do something like `git show
<sensible-hash>:Documentation/config.adoc`—in other words, the text is
a part of this commit (and its parent) even if we don't (fully) quote
it here.

Cheers,
Ben

Patch

diff --git a/userdiff.c b/userdiff.c
index 340c4eb4f7..5bbcc2b690 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -198,6 +198,10 @@  IPATTERN("fountain",
 	 "^((\\.[^.]|(int|ext|est|int\\.?/ext|i/e)[. ]).*)$",
 	 /* -- */
 	 "[^ \t-]+"),
+PATTERNS("gitconfig",
+         "^\\[[a-zA-Z0-9.-]+\\]|\\[[a-zA-Z0-9.-]+[ \t]+\".+\"\\]$",
+         /* -- */
+         "[^ \t]+"),
 PATTERNS("golang",
 	 /* Functions */
 	 "^[ \t]*(func[ \t]*.*(\\{[ \t]*)?)\n"
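
To make the effect of the permissive word pattern `[^ \t]+` more concrete, here is a minimal C sketch that counts how many words a pattern finds in a line. The helper `count_words` is hypothetical and uses plain POSIX regexec; it only approximates how a word regex splits a line and is not Git's word-diff implementation.

```c
#include <regex.h>
#include <stddef.h>

/* Count non-overlapping matches of an extended regex in line; -1 on error. */
int count_words(const char *pattern, const char *line)
{
	regex_t re;
	regmatch_t m;
	int count = 0;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	while (*line && regexec(&re, line, 1, &m, 0) == 0) {
		if (m.rm_eo == 0) {
			/* Empty match: skip one byte so the loop makes progress. */
			line++;
			continue;
		}
		count++;
		line += m.rm_eo; /* Continue right after the previous match. */
	}
	regfree(&re);
	return count;
}
```

For the line `"\turl = https://example.com/repo.git"`, `count_words("[^ \t]+", ...)` finds three words (`url`, `=`, and the whole URL), whereas a narrower pattern such as `[a-zA-Z0-9]+` splits the same line into six. With the permissive pattern, a changed value would presumably be treated as a single changed word rather than as several fragments.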