Message ID | c75806d011b04f2ad7efbbec01613a2d0b1f570b.1584477196.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | upload-pack.c: limit allowed filter choices | expand |
On Tue, Mar 17, 2020 at 4:40 PM Taylor Blau <me@ttaylorr.com> wrote: > In a subsequent commit, we will add configuration options that are > specific to each kind of object filter, in which case it is handy to > have a function that translates between 'enum > list_objects_filter_choice' and an appropriate configuration-friendly > string. > --- Missing sign-off (but perhaps that's intentional since this is RFC). > diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c > @@ -15,6 +15,31 @@ static int parse_combine_filter( > +const char *list_object_filter_config_name(enum list_objects_filter_choice c) > +{ > + switch (c) { > + case LOFC_BLOB_NONE: > + return "blob:none"; > + case LOFC_BLOB_LIMIT: > + return "blob:limit"; > + case LOFC_TREE_DEPTH: > + return "tree:depth"; > + case LOFC_SPARSE_OID: > + return "sparse:oid"; > + case LOFC_COMBINE: > + return "combine"; > + case LOFC_DISABLED: > + case LOFC__COUNT: > + /* > + * Include these to catch all enumerated values, but > + * break to treat them as a bug. Any new values of this > + * enum will cause a compiler error, as desired. > + */ In general, people will see a warning, not an error, unless they specifically use -Werror (or such) to turn the warning into an error, so this statement is misleading. Also, while some compilers may complain, others may not. So, although the comment claims that we will notice an unhandled enum constant at compile-time, that isn't necessarily the case. Moreover, the comment itself, in is present form, is rather superfluous since its merely repeating what the BUG() invocation just below it already tells me. In fact, as a reader of this code, I would be more interested in knowing why those two cases do not have string equivalents which are returned (although perhaps even that would be obvious to someone familiar with the code, hence the comment can probably be dropped altogether). > + break; > + } > + BUG("list_object_filter_choice_name: invalid argument '%d'", c); > +}
On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote: > > + case LOFC_DISABLED: > > + case LOFC__COUNT: > > + /* > > + * Include these to catch all enumerated values, but > > + * break to treat them as a bug. Any new values of this > > + * enum will cause a compiler error, as desired. > > + */ > > In general, people will see a warning, not an error, unless they > specifically use -Werror (or such) to turn the warning into an error, > so this statement is misleading. Also, while some compilers may > complain, others may not. So, although the comment claims that we will > notice an unhandled enum constant at compile-time, that isn't > necessarily the case. Yes, but that's the best we can do, isn't it? There's sort of a meta-issue here which Taylor and I discussed off-list and which led to this comment. We quite often write switch statements over enums like this: switch (foo) case FOO_ONE: ...do something... case FOO_TWO: ...something else... default: BUG("I don't know what to do with %d", foo); } That's reasonable and does the right thing at runtime if we ever hit this case. But it has the unfortunate side effect that we lose any -Wswitch warning that could tell us at compile time that we're missing a case. Not everybody would see such a warning, as you note, but developers on gcc and clang generally would (it's part of -Wall). But we can't just remove the default case. Even though enums don't generally take on other values, it's legal for them to do so. So we do want to make sure we BUG() in that instance. This is awkward to solve in the general case[1]. But because we're returning in each case arm here, it's easy to just put the BUG() after the switch. Anything that didn't return is unhandled, and we get the best of both: -Wswitch warnings when we need to add a new filter type, and a BUG() in the off chance that we see an unexpected value. But the cost is that we have to enumerate the set of values that are defined but not handled here (LOFC__COUNT, for instance, isn't a real enum value but rather a placeholder to let other code know how many filter types there are). So...I dunno. Worth it as a general technique? -Peff [1] In the general case where you don't return, you have to somehow know whether the value was actually handled or not (and BUG() if it wasn't). Presumably by keeping a separate flag variable, which is pretty ugly. -Wswitch-enum is supposed to deal with this by requiring that you list all of the values even if you have a default case. But it triggers in a lot of other places in the code that I think would be made much harder to read by having to list out the enumerated possibilities.
Jeff King <peff@peff.net> writes: > But the cost is that we have to enumerate the set of values that are > defined but not handled here (LOFC__COUNT, for instance, isn't a real > enum value but rather a placeholder to let other code know how many > filter types there are). > > So...I dunno. Worth it as a general technique? "This is a possible value in the enum we are switching on, so I write a case arm for it, but we do nothing for it here" is OK, but if it were "we do nothing for it here or anywhere" (i.e. the maximum enum value defined as a sentinel), the resulting code would be ugly. I am not sure if the tradeoff is good to force such an ugliness on readers' eyes to squelch the -Wswitch warnings. So, I dunno.
Hi Eric, On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote: > On Tue, Mar 17, 2020 at 4:40 PM Taylor Blau <me@ttaylorr.com> wrote: > > In a subsequent commit, we will add configuration options that are > > specific to each kind of object filter, in which case it is handy to > > have a function that translates between 'enum > > list_objects_filter_choice' and an appropriate configuration-friendly > > string. > > --- > > Missing sign-off (but perhaps that's intentional since this is RFC). Yes, the missing sign-off (in this patch as well as 2/2) is intentional, since this is an RFC. Sorry for not calling this out more clearly in my cover. Thanks, Taylor
On Wed, Mar 18, 2020 at 6:03 AM Jeff King <peff@peff.net> wrote: > On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote: > > > + case LOFC__COUNT: > > > + /* > > > + * Include these to catch all enumerated values, but > > > + * break to treat them as a bug. Any new values of this > > > + * enum will cause a compiler error, as desired. > > > + */ > > > > In general, people will see a warning, not an error, unless they > > specifically use -Werror (or such) to turn the warning into an error, > > so this statement is misleading. Also, while some compilers may > > complain, others may not. So, although the comment claims that we will > > notice an unhandled enum constant at compile-time, that isn't > > necessarily the case. > > Yes, but that's the best we can do, isn't it? To be clear, I wasn't questioning the code structure at all. I was specifically referring to the comment talking about "error" when it should say "warning" or "possible warning". Moreover, normally, we use comments to highlight something in the code which is not obvious or straightforward, so I was questioning whether this comment is even helpful since the code seems reasonably clear. And... > But we can't just remove the default case. Even though enums don't > generally take on other values, it's legal for them to do so. So we do > want to make sure we BUG() in that instance. > > This is awkward to solve in the general case[1]. But because we're > returning in each case arm here, it's easy to just put the BUG() after > the switch. Anything that didn't return is unhandled, and we get the > best of both: -Wswitch warnings when we need to add a new filter type, > and a BUG() in the off chance that we see an unexpected value. > > So...I dunno. Worth it as a general technique? ...if this is or will become an idiom we want in this codebase, then it would be silly to write an explanatory comment every place we employ it. Instead, a document such as CodingGuidelines would likely be a better fit for such knowledge.
On Wed, Mar 18, 2020 at 06:38:49PM -0400, Eric Sunshine wrote: > To be clear, I wasn't questioning the code structure at all. I was > specifically referring to the comment talking about "error" when it > should say "warning" or "possible warning". > > Moreover, normally, we use comments to highlight something in the code > which is not obvious or straightforward, so I was questioning whether > this comment is even helpful since the code seems reasonably clear. > And... OK, I agree with all that. :) > ...if this is or will become an idiom we want in this codebase, then > it would be silly to write an explanatory comment every place we > employ it. Instead, a document such as CodingGuidelines would likely > be a better fit for such knowledge. Yeah, that makes sense. If we do use this technique, though, we'll have to explicitly list "case" lines for the enum values which are meant to break out to the BUG(). And there it _is_ worth commenting on "yes, we know about this value but it is not handled here because...". Which is what you asked for in your original message. :) Something like: switch (c) { case LOFC_BLOB_NONE: return "blob:none": ..etc... case LOFC__COUNT: /* not a real filter type; just a marker for counting the number */ break; case LOFC_DISABLED: /* we have no name for "no filter at all" */ break; } BUG(...); -Peff
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 256bcfbdfe..6b6aa0b3ec 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -15,6 +15,31 @@ static int parse_combine_filter( const char *arg, struct strbuf *errbuf); +const char *list_object_filter_config_name(enum list_objects_filter_choice c) +{ + switch (c) { + case LOFC_BLOB_NONE: + return "blob:none"; + case LOFC_BLOB_LIMIT: + return "blob:limit"; + case LOFC_TREE_DEPTH: + return "tree:depth"; + case LOFC_SPARSE_OID: + return "sparse:oid"; + case LOFC_COMBINE: + return "combine"; + case LOFC_DISABLED: + case LOFC__COUNT: + /* + * Include these to catch all enumerated values, but + * break to treat them as a bug. Any new values of this + * enum will cause a compiler error, as desired. + */ + break; + } + BUG("list_object_filter_choice_name: invalid argument '%d'", c); +} + /* * Parse value of the argument to the "filter" keyword. * On the command line this looks like: diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index 2ffb39222c..e5259e4ac6 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -17,6 +17,12 @@ enum list_objects_filter_choice { LOFC__COUNT /* must be last */ }; +/* + * Returns a configuration key suitable for describing the given object filter, + * e.g.: "blob:none", "combine", etc. + */ +const char *list_object_filter_config_name(enum list_objects_filter_choice c); + struct list_objects_filter_options { /* * 'filter_spec' is the raw argument value given on the command line