Message ID | 20200410151032.23198-1-martin.agren@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | merge: use skip_prefix to parse config key | expand |
On Fri, Apr 10, 2020 at 05:10:32PM +0200, Martin Ågren wrote: > Instead of using `starts_with()`, the magic number 7, `strlen()` and a > fair number of additions to verify the three parts of the config key > "branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them > more explicitly. The conversion looks correct to me and is certainly an improvement. > We need to introduce a new variable for this (we certainly can't modify > `k` just because we see "branch."!). With `skip_prefix()` we often use > quite bland names like `p` or `str`. Let's do the same. If and when this > function needs to do more prefix-skipping, we'll have a generic variable > ready for this. I was about to comment on this in the patch, but your explanation preempted me. :) The logic here makes sense. > diff --git a/builtin/merge.c b/builtin/merge.c > index d127d2225f..bde5f14f05 100644 > --- a/builtin/merge.c > +++ b/builtin/merge.c > @@ -597,10 +597,10 @@ static void parse_branch_merge_options(char *bmo) > static int git_merge_config(const char *k, const char *v, void *cb) > { > int status; > + const char *str; > > - if (branch && starts_with(k, "branch.") && > - starts_with(k + 7, branch) && > - !strcmp(k + 7 + strlen(branch), ".mergeoptions")) { > + if (branch && skip_prefix(k, "branch.", &str) && > + skip_prefix(str, branch, &str) && !strcmp(str, ".mergeoptions")) { > free(branch_mergeoptions); > branch_mergeoptions = xstrdup(v); > return 0; In general, parsing subsections accurately involves looking from both the start and the end of the string, pulling out the section and key and leaving the rest in the middle. But I think we can get away with this left-to-right parsing because we're only interested in matching a _specific_ subsection name, and a specific key. So there are no cases it will handle incorrectly. The more general form would be: const char *subsection, *key; int subsection_len; if (!parse_config_key("branch", &subsection, &subsection_len, &key) && subsection_len == strlen(branch) && !strncmp(subsection, branch) && !strcmp(key, "mergeoptions")) ... but that's a bit more awkward (it would be less so if we had a helper function for comparing a NUL-terminated string against a ptr/len pair). -Peff
Jeff King <peff@peff.net> writes: > In general, parsing subsections accurately involves looking from both > the start and the end of the string, pulling out the section and key and > leaving the rest in the middle. But I think we can get away with this > left-to-right parsing because we're only interested in matching a > _specific_ subsection name, and a specific key. So there are no cases it > will handle incorrectly. In other words, if k were "branch.A.B.mergeoptions", it can only be the 'branch.*.mergeoptions' variable attached to branch "A.B", but when checking for branch=="A", the first two skip_prefix() would pass and the only thing that protects us from misparsing is that "B.mergeoptions" is not what we are looking for. > The more general form would be: > > const char *subsection, *key; > int subsection_len; > > if (!parse_config_key("branch", &subsection, &subsection_len, &key) && > subsection_len == strlen(branch) && !strncmp(subsection, branch) && > !strcmp(key, "mergeoptions")) > ... > > but that's a bit more awkward (it would be less so if we had a helper > function for comparing a NUL-terminated string against a ptr/len pair). Yes, but even with such a helper, i.e. if (branch && !parse_config_key("branch", &sub, &sublen, &key) && !spanstrcmp(sub, sublen, branch) && !strcmp(key, "mergeoptions")) what Martin wrote, especially if it is reflowed to match the above, i.e. if (branch && skip_prefix(key, "branch.", &sub) && skip_prefix(sub, branch, &sub) && !strcmp(sub, ".mergeoptions") I find it just as, if not more, easy to read. Where the parse_config_key() helper shines, I think, is when we do not have the middle level to compare against, and in that case, we must work only from the given key, scanning from both ends for dot. Thanks.
On Fri, Apr 10, 2020 at 09:44:56AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > In general, parsing subsections accurately involves looking from both > > the start and the end of the string, pulling out the section and key and > > leaving the rest in the middle. But I think we can get away with this > > left-to-right parsing because we're only interested in matching a > > _specific_ subsection name, and a specific key. So there are no cases it > > will handle incorrectly. > > In other words, if k were "branch.A.B.mergeoptions", it can only be > the 'branch.*.mergeoptions' variable attached to branch "A.B", but > when checking for branch=="A", the first two skip_prefix() would > pass and the only thing that protects us from misparsing is that > "B.mergeoptions" is not what we are looking for. Yes, exactly. > > The more general form would be: > [...] > Yes, but even with such a helper, i.e. > [...] > what Martin wrote, especially if it is reflowed to match the above, i.e. > [...] > I find it just as, if not more, easy to read. Yeah, sorry if I was unclear; I think the left-to-right is perfectly fine for this case. > Where the parse_config_key() helper shines, I think, is when we do > not have the middle level to compare against, and in that case, we > must work only from the given key, scanning from both ends for dot. Agreed. Another option for known-value cases like this is to do it outside of the callback handler, like: char *key = xstrfmt("branch.%s.mergeoptions"); const char *value; if (!git_config_get_string_const(key, &value)) ... free(key); The allocation is a bit awkward, though we could hide that with a clever helper. Shifting from "iterate over and store config keys" to "look up config keys on demand" is a much bigger change, though. I would only do it if it made the rest of the code flow easier. -Peff
Jeff King <peff@peff.net> writes: > Shifting from "iterate over and store config keys" to "look up config > keys on demand" is a much bigger change, though. I would only do it if > it made the rest of the code flow easier. Yeah, absolutely. Thanks.
diff --git a/builtin/merge.c b/builtin/merge.c index d127d2225f..bde5f14f05 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -597,10 +597,10 @@ static void parse_branch_merge_options(char *bmo) static int git_merge_config(const char *k, const char *v, void *cb) { int status; + const char *str; - if (branch && starts_with(k, "branch.") && - starts_with(k + 7, branch) && - !strcmp(k + 7 + strlen(branch), ".mergeoptions")) { + if (branch && skip_prefix(k, "branch.", &str) && + skip_prefix(str, branch, &str) && !strcmp(str, ".mergeoptions")) { free(branch_mergeoptions); branch_mergeoptions = xstrdup(v); return 0;
Instead of using `starts_with()`, the magic number 7, `strlen()` and a fair number of additions to verify the three parts of the config key "branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them more explicitly. We need to introduce a new variable for this (we certainly can't modify `k` just because we see "branch."!). With `skip_prefix()` we often use quite bland names like `p` or `str`. Let's do the same. If and when this function needs to do more prefix-skipping, we'll have a generic variable ready for this. Adjust the indentation while we're here. Signed-off-by: Martin Ågren <martin.agren@gmail.com> --- worktree.c also matches "strcmp.*strlen" and could see a similar patch. I'm working on a longer series for that file and this fell out of that work. This feels independent enough that I'm posting it on its own. builtin/merge.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)