Message ID | cover.1573670565.git.martin.agren@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | builtin/config: canonicalize "value_regex" with `--type=bool[-or-int]` | expand |
Martin Ågren <martin.agren@gmail.com> writes: > git config --type=bool --name-only --get-regexp '^foo\.' true > ... > This patch series teaches `git config` to canonicalize the incoming > "value_regex" ("true" in the example above), then canonicalize candidate > values as we go through the config. Or if you will, `git config` learns > a brand new type of regex, corresponding to the different ways there are > of spelling "true" and "false", respectively. Nice ;-) > `--type=bool-or-int` gets the same treatment, except we need to to be > able to handle the ints and regexes matching particular ints that we > must expect. Hmm, so I can say 1024k or 1m and that would match 1048576? Doubly nice. Looking forward to reading it thru.
On Wed, Nov 13, 2019 at 07:54:59PM +0100, Martin Ågren wrote: > To find all config items "foo.*" that are configured as the Boolean > value "true", you can try executing > > git config --type=bool --name-only --get-regexp '^foo\.' true > > ... and hope that you didn't spell "true" as "on", "True" or "1" back > when you populated your config. This shortcoming has been mentioned as a > left-over bit [1] [2]. This seems like a good idea, but I wonder why we'd limit it to bools? It seems like any type would probably be better off matching a normalized version. We already have two functions in builtin/config.c which handle types: - format_config() is for actually interpreting an existing value and writing it to stdout - normalize_value() is used when writing a new value into a config file, and normalizing what the user provided on the command-line I don't think you'd want to use format_config() here. For example, if I say: git config --type=color --get-regexp ^foo red I want to match the original "red" color, but _output_ the ANSI code. But normalize_value(), by contrast, leaves colors intact. So maybe it's a good match? OTOH, I'd probably expect to match expanded pathnames or expiration dates there, too, and it doesn't expand those. Those ones are less clear to me. The whole premise of this series is making an assumption that "--type" is about how you want to match, and not just about what you want to output. In your example above it's clear that you don't care about the output type (because you're using --name-only), but for: git config --type=path --get-regexp ^foo /home/peff it's unclear if you want to match a value of "~peff/foo", or if you simply want to output the expansion. I wonder if we'd want to allow specifying the output type and the matching type separately? Maybe that just makes it more awkward to use for little benefit (I admit that it's hard to imagine somebody wanting to normalize bools on output but _not_ for matching). Anyway. It would be nice if we could build on the existing logic in some way, rather than making a third place that has to handle every type we know about. > `--type=bool-or-int` gets the same treatment, except we need to to be > able to handle the ints and regexes matching particular ints that we > must expect. That said, even with `--type=bool` we can't move too > aggressively towards *requiring* that the incoming "value_regex" > canonializes as a Boolean value. The penultimate patch starts to warn on > non-canonicalizing values; the final patch makes us bail out entirely. > > The last patch is not meant for immediate inclusion, but I post it > anyway. I can re-submit it at an appropriate time, or maybe it could > slumber on pu until the time is ripe for completing the switch. I think bailing on values that can't be converted is normal for other code paths. E.g., just trying to print: $ git -c foo.bar=abc config --type=bool foo.abr fatal: bad numeric config value 'abc' for 'foo.bar': invalid unit -Peff
Hi Junio On Thu, 14 Nov 2019 at 03:19, Junio C Hamano <gitster@pobox.com> wrote: > > Martin Ågren <martin.agren@gmail.com> writes: > > > `--type=bool-or-int` gets the same treatment, except we need to to be > > able to handle the ints and regexes matching particular ints that we > > must expect. > > Hmm, so I can say 1024k or 1m and that would match 1048576? > > Doubly nice. > > Looking forward to reading it thru. Maybe you already noticed, but no, I didn't get to canonicalizing integers like that. What I meant was, type=bool-or-int learns to handle booleans similar to what I did to type=bool. I don't feel entirely satisfied by some of my commit message oneliners. They could make that a bit clearer, I think. Not directly related to your question, but I realize now that with type=bool-or-int, I only added the first of these example usages below as a test. The second one is perhaps just as interesting. $ ./git -c an.int=1 config --get --type=bool-or-int an.int 1 1 $ ./git -c an.int=1 config --get --type=bool-or-int an.int on 1 This last one emits "1". That's because by the time we've decided to output the value, `format_config()` has some logic around type=bool-or-int, but doesn't know about why exactly we selected this an.int=1 in the first place (git_parse_maybe_bool("1") == git_parse_maybe_bool_TEXT("on")). Just after thinking about this for a short while, I can't immediately say whether this second one should emit "1" or "true". My added documentation is actually vague enough to allow both of these to happen... I'll ponder this. Martin
On Thu, 14 Nov 2019 at 07:29, Jeff King <peff@peff.net> wrote: > This seems like a good idea, but I wonder why we'd limit it to bools? Basically because that's what the existing left-over-bits mentioned and it felt like a good place to start. But you're right in asking about the bigger picture up front. > It seems like any type would probably be better off matching a > normalized version. > > We already have two functions in builtin/config.c which handle types: > > - format_config() is for actually interpreting an existing value and > writing it to stdout > > - normalize_value() is used when writing a new value into a config > file, and normalizing what the user provided on the command-line > > I don't think you'd want to use format_config() here. I just speculated a little around format_config() in a reply to Junio. Already with my patch series and with type=bool-or-int, there's a visible funny-funky corner case where one hand doesn't know what the other is doing. > For example, if I > say: > > git config --type=color --get-regexp ^foo red > > I want to match the original "red" color, but _output_ the ANSI code. > But normalize_value(), by contrast, leaves colors intact. So maybe it's > a good match? > > OTOH, I'd probably expect to match expanded pathnames or expiration > dates there, too, and it doesn't expand those. Those ones are less clear > to me. The whole premise of this series is making an assumption that > "--type" is about how you want to match, Right. > and not just about what you > want to output. In your example above it's clear that you don't care > about the output type (because you're using --name-only), but for: > > git config --type=path --get-regexp ^foo /home/peff > > it's unclear if you want to match a value of "~peff/foo", or if you > simply want to output the expansion. Hmm, I feel like we're opening a can of worms here. > I wonder if we'd want to allow specifying the output type and the > matching type separately? Maybe that just makes it more awkward to use > for little benefit (I admit that it's hard to imagine somebody wanting > to normalize bools on output but _not_ for matching). > > Anyway. It would be nice if we could build on the existing logic in some > way, rather than making a third place that has to handle every type we > know about. Understood. Thanks a lot for sharing your thoughts. > > The last patch is not meant for immediate inclusion, but I post it > > anyway. I can re-submit it at an appropriate time, or maybe it could > > slumber on pu until the time is ripe for completing the switch. > > I think bailing on values that can't be converted is normal for other > code paths. E.g., just trying to print: > > $ git -c foo.bar=abc config --type=bool foo.abr > fatal: bad numeric config value 'abc' for 'foo.bar': invalid unit I'm not sure if you mean "... so we could be a lot more aggressive here"? I'm running now and I feel like I'll need to read your mail again tonight and get back to you in more detail. Thanks Martin
On Thu, Nov 14, 2019 at 07:54:35AM +0100, Martin Ågren wrote: > > > The last patch is not meant for immediate inclusion, but I post it > > > anyway. I can re-submit it at an appropriate time, or maybe it could > > > slumber on pu until the time is ripe for completing the switch. > > > > I think bailing on values that can't be converted is normal for other > > code paths. E.g., just trying to print: > > > > $ git -c foo.bar=abc config --type=bool foo.abr > > fatal: bad numeric config value 'abc' for 'foo.bar': invalid unit > > I'm not sure if you mean "... so we could be a lot more aggressive > here"? Yeah, I think it's OK to be aggressive with bailing when the user gave us a --type, but the value doesn't match it. > I'm running now and I feel like I'll need to read your mail again > tonight and get back to you in more detail. Sure, no rush and thanks for working on it. :) -Peff