Message ID | f277a7a429db8f54fa06dd1965d62ec491e6d84b.1664287711.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | config API: return empty list, not NULL | expand |
On Tue, Sep 27 2022, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <derrickstolee@github.com> > > The git_die_config() method calls git_config_get_value_multi() but > immediately navigates to its first value without checking if the result > is NULL or empty. Callers should only call git_die_config() if there is > at least one value for the given 'key', but such a mistaken use might > slip through. It would be better to show a BUG() statement than a > possible segfault. > > Signed-off-by: Derrick Stolee <derrickstolee@github.com> > --- > config.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/config.c b/config.c > index bf89afbdab0..0c41606c7d4 100644 > --- a/config.c > +++ b/config.c > @@ -2833,8 +2833,13 @@ void git_die_config(const char *key, const char *err, ...) > va_end(params); > } > values = git_config_get_value_multi(key); > - kv_info = values->items[values->nr - 1].util; > - git_die_config_linenr(key, kv_info->filename, kv_info->linenr); > + > + if (values && values->nr) { > + kv_info = values->items[values->nr - 1].util; > + git_die_config_linenr(key, kv_info->filename, kv_info->linenr); > + } else { > + BUG("expected a non-empty list of values"); > + } > } > > /* AFAIKT the intent of the current code on "master" is that this will only get called if the likes of git_configset_get_string() returns < 0, not if it returns > 0. So isn't the combination of your 1/5 and this 3/5 now conflating these two conditions? See e.g. repo_config_get_string_tmp() and when it would call git_die_config(). I.e. isn't the whole point of git_die_config() to print an error message about a configuration *value* that we've parsed out of the config? If e.g. the key itself is bad we'll get a -1, but in this case it seems we would have a BUG(), but it's not that we "expected a non-empty list of values", but that the state of the world changed between our previous configset invocation, no?
On 9/27/2022 12:17 PM, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Sep 27 2022, Derrick Stolee via GitGitGadget wrote: > >> From: Derrick Stolee <derrickstolee@github.com> >> >> The git_die_config() method calls git_config_get_value_multi() but >> immediately navigates to its first value without checking if the result >> is NULL or empty. Callers should only call git_die_config() if there is >> at least one value for the given 'key', but such a mistaken use might >> slip through. It would be better to show a BUG() statement than a >> possible segfault. > AFAIKT the intent of the current code on "master" is that this will only > get called if the likes of git_configset_get_string() returns < 0, not > if it returns > 0. > > So isn't the combination of your 1/5 and this 3/5 now conflating these > two conditions? See e.g. repo_config_get_string_tmp() and when it would > call git_die_config(). > > I.e. isn't the whole point of git_die_config() to print an error message > about a configuration *value* that we've parsed out of the config? > > If e.g. the key itself is bad we'll get a -1, but in this case it seems > we would have a BUG(), but it's not that we "expected a non-empty list > of values", but that the state of the world changed between our previous > configset invocation, no? If git_die_config() was static to config.c, then I would agree with you that its use is controlled enough to avoid that possibility. However, it is available in config.h and its doc comment does not say anything about "make sure 'key' properly parses at least one value". The point is defense-in-depth to prevent a segfault if someone adds an incorrect caller. It looks too easy to add an erroneous caller. Thanks, -Stolee
On Tue, Sep 27 2022, Derrick Stolee wrote: > On 9/27/2022 12:17 PM, Ævar Arnfjörð Bjarmason wrote: >> >> On Tue, Sep 27 2022, Derrick Stolee via GitGitGadget wrote: >> >>> From: Derrick Stolee <derrickstolee@github.com> >>> >>> The git_die_config() method calls git_config_get_value_multi() but >>> immediately navigates to its first value without checking if the result >>> is NULL or empty. Callers should only call git_die_config() if there is >>> at least one value for the given 'key', but such a mistaken use might >>> slip through. It would be better to show a BUG() statement than a >>> possible segfault. > >> AFAIKT the intent of the current code on "master" is that this will only >> get called if the likes of git_configset_get_string() returns < 0, not >> if it returns > 0. >> >> So isn't the combination of your 1/5 and this 3/5 now conflating these >> two conditions? See e.g. repo_config_get_string_tmp() and when it would >> call git_die_config(). >> >> I.e. isn't the whole point of git_die_config() to print an error message >> about a configuration *value* that we've parsed out of the config? >> >> If e.g. the key itself is bad we'll get a -1, but in this case it seems >> we would have a BUG(), but it's not that we "expected a non-empty list >> of values", but that the state of the world changed between our previous >> configset invocation, no? > > If git_die_config() was static to config.c, then I would agree with you > that its use is controlled enough to avoid that possibility. However, it > is available in config.h and its doc comment does not say anything about > "make sure 'key' properly parses at least one value". It does, it's part of the configset API, which any non-trivial user understands adhares to the semantics of the configset cache. E.g. the fast-import.c caller is doing, basically: if (!git_config_get_int("pack.indexversion", &indexversion_value)) /* ...if we're not happy with indexversion_value ...*/ git_die_config("pack.indexversion", ...); Anyway, I'm all for adding some extra paranoia with a BUG(), it was just unclear to me if this was some segfault we were expecting to perhaps have now, or with API use after this series. AFAICT it's really neither of those, but just some side-paranoia added while-at-it for a *different* use-case than the primary goal of this series. I.e. one where you're looking up a key you don't know exists, whereas git_die_config() is for a key we *know* exists, but don't like its value. I think it's better if we just clarify that it's supposed to be used by that in the API docs, and if you insist on the BUG then the: BUG("expected a non-empty list of values"); Should really get to the point directly, and say something like: BUG("called with non-existing key '%s'? " "Call git_die_config(...) only with keys whose value you don't like", key);
diff --git a/config.c b/config.c index bf89afbdab0..0c41606c7d4 100644 --- a/config.c +++ b/config.c @@ -2833,8 +2833,13 @@ void git_die_config(const char *key, const char *err, ...) va_end(params); } values = git_config_get_value_multi(key); - kv_info = values->items[values->nr - 1].util; - git_die_config_linenr(key, kv_info->filename, kv_info->linenr); + + if (values && values->nr) { + kv_info = values->items[values->nr - 1].util; + git_die_config_linenr(key, kv_info->filename, kv_info->linenr); + } else { + BUG("expected a non-empty list of values"); + } } /*