diff mbox series

[RFC,1/2] list_objects_filter_options: introduce 'list_object_filter_config_name'

Message ID c75806d011b04f2ad7efbbec01613a2d0b1f570b.1584477196.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series upload-pack.c: limit allowed filter choices | expand

Commit Message

Taylor Blau March 17, 2020, 8:39 p.m. UTC
In a subsequent commit, we will add configuration options that are
specific to each kind of object filter, in which case it is handy to
have a function that translates between 'enum
list_objects_filter_choice' and an appropriate configuration-friendly
string.
---
 list-objects-filter-options.c | 25 +++++++++++++++++++++++++
 list-objects-filter-options.h |  6 ++++++
 2 files changed, 31 insertions(+)

Comments

Eric Sunshine March 17, 2020, 8:53 p.m. UTC | #1
On Tue, Mar 17, 2020 at 4:40 PM Taylor Blau <me@ttaylorr.com> wrote:
> In a subsequent commit, we will add configuration options that are
> specific to each kind of object filter, in which case it is handy to
> have a function that translates between 'enum
> list_objects_filter_choice' and an appropriate configuration-friendly
> string.
> ---

Missing sign-off (but perhaps that's intentional since this is RFC).

> diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
> @@ -15,6 +15,31 @@ static int parse_combine_filter(
> +const char *list_object_filter_config_name(enum list_objects_filter_choice c)
> +{
> +       switch (c) {
> +       case LOFC_BLOB_NONE:
> +               return "blob:none";
> +       case LOFC_BLOB_LIMIT:
> +               return "blob:limit";
> +       case LOFC_TREE_DEPTH:
> +               return "tree:depth";
> +       case LOFC_SPARSE_OID:
> +               return "sparse:oid";
> +       case LOFC_COMBINE:
> +               return "combine";
> +       case LOFC_DISABLED:
> +       case LOFC__COUNT:
> +               /*
> +                * Include these to catch all enumerated values, but
> +                * break to treat them as a bug. Any new values of this
> +                * enum will cause a compiler error, as desired.
> +                */

In general, people will see a warning, not an error, unless they
specifically use -Werror (or such) to turn the warning into an error,
so this statement is misleading. Also, while some compilers may
complain, others may not. So, although the comment claims that we will
notice an unhandled enum constant at compile-time, that isn't
necessarily the case.

Moreover, the comment itself, in is present form, is rather
superfluous since its merely repeating what the BUG() invocation just
below it already tells me. In fact, as a reader of this code, I would
be more interested in knowing why those two cases do not have string
equivalents which are returned (although perhaps even that would be
obvious to someone familiar with the code, hence the comment can
probably be dropped altogether).

> +               break;
> +       }
> +       BUG("list_object_filter_choice_name: invalid argument '%d'", c);
> +}
Jeff King March 18, 2020, 10:03 a.m. UTC | #2
On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote:

> > +       case LOFC_DISABLED:
> > +       case LOFC__COUNT:
> > +               /*
> > +                * Include these to catch all enumerated values, but
> > +                * break to treat them as a bug. Any new values of this
> > +                * enum will cause a compiler error, as desired.
> > +                */
> 
> In general, people will see a warning, not an error, unless they
> specifically use -Werror (or such) to turn the warning into an error,
> so this statement is misleading. Also, while some compilers may
> complain, others may not. So, although the comment claims that we will
> notice an unhandled enum constant at compile-time, that isn't
> necessarily the case.

Yes, but that's the best we can do, isn't it?

There's sort of a meta-issue here which Taylor and I discussed off-list
and which led to this comment. We quite often write switch statements
over enums like this:

  switch (foo)
  case FOO_ONE:
	...do something...
  case FOO_TWO:
        ...something else...
  default:
	BUG("I don't know what to do with %d", foo);
  }

That's reasonable and does the right thing at runtime if we ever hit
this case. But it has the unfortunate side effect that we lose any
-Wswitch warning that could tell us at compile time that we're missing a
case. Not everybody would see such a warning, as you note, but
developers on gcc and clang generally would (it's part of -Wall).

But we can't just remove the default case. Even though enums don't
generally take on other values, it's legal for them to do so. So we do
want to make sure we BUG() in that instance.

This is awkward to solve in the general case[1]. But because we're
returning in each case arm here, it's easy to just put the BUG() after
the switch. Anything that didn't return is unhandled, and we get the
best of both: -Wswitch warnings when we need to add a new filter type,
and a BUG() in the off chance that we see an unexpected value.

But the cost is that we have to enumerate the set of values that are
defined but not handled here (LOFC__COUNT, for instance, isn't a real
enum value but rather a placeholder to let other code know how many
filter types there are).

So...I dunno. Worth it as a general technique?

-Peff

[1] In the general case where you don't return, you have to somehow know
    whether the value was actually handled or not (and BUG() if it
    wasn't). Presumably by keeping a separate flag variable, which is
    pretty ugly. -Wswitch-enum is supposed to deal with this by
    requiring that you list all of the values even if you have a default
    case. But it triggers in a lot of other places in the code that I
    think would be made much harder to read by having to list out the
    enumerated possibilities.
Junio C Hamano March 18, 2020, 7:40 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> But the cost is that we have to enumerate the set of values that are
> defined but not handled here (LOFC__COUNT, for instance, isn't a real
> enum value but rather a placeholder to let other code know how many
> filter types there are).
>
> So...I dunno. Worth it as a general technique?

"This is a possible value in the enum we are switching on, so I
write a case arm for it, but we do nothing for it here" is OK, but
if it were "we do nothing for it here or anywhere" (i.e. the maximum
enum value defined as a sentinel), the resulting code would be ugly.

I am not sure if the tradeoff is good to force such an ugliness on
readers' eyes to squelch the -Wswitch warnings.

So, I dunno.
Taylor Blau March 18, 2020, 9:05 p.m. UTC | #4
Hi Eric,

On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote:
> On Tue, Mar 17, 2020 at 4:40 PM Taylor Blau <me@ttaylorr.com> wrote:
> > In a subsequent commit, we will add configuration options that are
> > specific to each kind of object filter, in which case it is handy to
> > have a function that translates between 'enum
> > list_objects_filter_choice' and an appropriate configuration-friendly
> > string.
> > ---
>
> Missing sign-off (but perhaps that's intentional since this is RFC).

Yes, the missing sign-off (in this patch as well as 2/2) is intentional,
since this is an RFC. Sorry for not calling this out more clearly in my
cover.

Thanks,
Taylor
Eric Sunshine March 18, 2020, 10:38 p.m. UTC | #5
On Wed, Mar 18, 2020 at 6:03 AM Jeff King <peff@peff.net> wrote:
> On Tue, Mar 17, 2020 at 04:53:44PM -0400, Eric Sunshine wrote:
> > > +       case LOFC__COUNT:
> > > +               /*
> > > +                * Include these to catch all enumerated values, but
> > > +                * break to treat them as a bug. Any new values of this
> > > +                * enum will cause a compiler error, as desired.
> > > +                */
> >
> > In general, people will see a warning, not an error, unless they
> > specifically use -Werror (or such) to turn the warning into an error,
> > so this statement is misleading. Also, while some compilers may
> > complain, others may not. So, although the comment claims that we will
> > notice an unhandled enum constant at compile-time, that isn't
> > necessarily the case.
>
> Yes, but that's the best we can do, isn't it?

To be clear, I wasn't questioning the code structure at all. I was
specifically referring to the comment talking about "error" when it
should say "warning" or "possible warning".

Moreover, normally, we use comments to highlight something in the code
which is not obvious or straightforward, so I was questioning whether
this comment is even helpful since the code seems reasonably clear.
And...

> But we can't just remove the default case. Even though enums don't
> generally take on other values, it's legal for them to do so. So we do
> want to make sure we BUG() in that instance.
>
> This is awkward to solve in the general case[1]. But because we're
> returning in each case arm here, it's easy to just put the BUG() after
> the switch. Anything that didn't return is unhandled, and we get the
> best of both: -Wswitch warnings when we need to add a new filter type,
> and a BUG() in the off chance that we see an unexpected value.
>
> So...I dunno. Worth it as a general technique?

...if this is or will become an idiom we want in this codebase, then
it would be silly to write an explanatory comment every place we
employ it. Instead, a document such as CodingGuidelines would likely
be a better fit for such knowledge.
Jeff King March 19, 2020, 5:15 p.m. UTC | #6
On Wed, Mar 18, 2020 at 06:38:49PM -0400, Eric Sunshine wrote:

> To be clear, I wasn't questioning the code structure at all. I was
> specifically referring to the comment talking about "error" when it
> should say "warning" or "possible warning".
> 
> Moreover, normally, we use comments to highlight something in the code
> which is not obvious or straightforward, so I was questioning whether
> this comment is even helpful since the code seems reasonably clear.
> And...

OK, I agree with all that. :)

> ...if this is or will become an idiom we want in this codebase, then
> it would be silly to write an explanatory comment every place we
> employ it. Instead, a document such as CodingGuidelines would likely
> be a better fit for such knowledge.

Yeah, that makes sense. If we do use this technique, though, we'll have
to explicitly list "case" lines for the enum values which are meant to
break out to the BUG(). And there it _is_ worth commenting on "yes, we
know about this value but it is not handled here because...". Which is
what you asked for in your original message. :)

Something like:

  switch (c) {
  case LOFC_BLOB_NONE:
	return "blob:none":
  ..etc...
  case LOFC__COUNT:
	/* not a real filter type; just a marker for counting the number */
	break;
  case LOFC_DISABLED:
	/* we have no name for "no filter at all" */
	break;
  }
  BUG(...);

-Peff
diff mbox series

Patch

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 256bcfbdfe..6b6aa0b3ec 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -15,6 +15,31 @@  static int parse_combine_filter(
 	const char *arg,
 	struct strbuf *errbuf);
 
+const char *list_object_filter_config_name(enum list_objects_filter_choice c)
+{
+	switch (c) {
+	case LOFC_BLOB_NONE:
+		return "blob:none";
+	case LOFC_BLOB_LIMIT:
+		return "blob:limit";
+	case LOFC_TREE_DEPTH:
+		return "tree:depth";
+	case LOFC_SPARSE_OID:
+		return "sparse:oid";
+	case LOFC_COMBINE:
+		return "combine";
+	case LOFC_DISABLED:
+	case LOFC__COUNT:
+		/*
+		 * Include these to catch all enumerated values, but
+		 * break to treat them as a bug. Any new values of this
+		 * enum will cause a compiler error, as desired.
+		 */
+		break;
+	}
+	BUG("list_object_filter_choice_name: invalid argument '%d'", c);
+}
+
 /*
  * Parse value of the argument to the "filter" keyword.
  * On the command line this looks like:
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 2ffb39222c..e5259e4ac6 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -17,6 +17,12 @@  enum list_objects_filter_choice {
 	LOFC__COUNT /* must be last */
 };
 
+/*
+ * Returns a configuration key suitable for describing the given object filter,
+ * e.g.: "blob:none", "combine", etc.
+ */
+const char *list_object_filter_config_name(enum list_objects_filter_choice c);
+
 struct list_objects_filter_options {
 	/*
 	 * 'filter_spec' is the raw argument value given on the command line