[v3,7/7] parse-options: introduce bounded integer options

Message ID 20250416-b4-pks-parse-options-integers-v3-7-d390746bea79@pks.im
State Superseded
Series parse-options: harden handling of integer values

Commit Message

Patrick Steinhardt April 16, 2025, 10:02 a.m. UTC
In the preceding commits we have introduced integer precisions. The
precision merely tracks bounds of the underlying data types so that we
don't try to for example write a `size_t` into an `unsigned`, which
could otherwise cause out-of-bounds writes.
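
As a rough illustration of what "precision" means here, the sketch below
derives the representable range from a byte width, using the same shift
that `do_get_value()` applies for unsigned values in the patch. The helper
names are hypothetical and not part of parse-options, the signed variant
is an assumption modelled on the unsigned one, and `precision` is assumed
to lie between 1 and `sizeof(uintmax_t)`.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only. `precision` is a width
 * in bytes and must be in the range [1, sizeof(uintmax_t)].
 */

/* Largest value of an unsigned integer that is `precision` bytes wide. */
static uintmax_t unsigned_max_for_precision(size_t precision)
{
	return UINTMAX_MAX >> (CHAR_BIT * (sizeof(uintmax_t) - precision));
}

/* Largest value of a signed integer that is `precision` bytes wide. */
static intmax_t signed_max_for_precision(size_t precision)
{
	return INTMAX_MAX >> (CHAR_BIT * (sizeof(intmax_t) - precision));
}

/* Smallest value of a two's-complement integer of that width. */
static intmax_t signed_min_for_precision(size_t precision)
{
	return -signed_max_for_precision(precision) - 1;
}
```

With a precision of 2 bytes this yields the familiar 16 bit limits, so a
parsed value outside of them can be rejected before it is written to the
destination.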

Some options may have bounds that are stricter than the underlying data
type. Right now, users of any such options would have to manually verify
that the value passed to such an option is inside the expected bounds.
This is rather tedious, and it leads to code duplication across sites
that wish to perform such bounds checks.

Introduce `OPT_*_BOUNDED()` options that alleviate this issue. Users
can optionally specify both a lower and upper bound, and if set we will
verify that the value passed by the user is in that range.
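
The narrowing for signed options can be sketched in isolation roughly as
follows. This is a simplified model rather than the patch's code:
`struct int_range`, `narrow_range()` and `in_range()` are hypothetical
names, a bound of `0` is taken to mean "not set", and the caller is
assumed to pass bounds that fit into the type's range (the patch reports
a violation of that via `BUG()`).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration; not part of parse-options. */
struct int_range {
	intmax_t lower;
	intmax_t upper;
};

/*
 * Narrow the range of the underlying type by the optional bounds.
 * A bound of 0 means "not set" and leaves the type's limit in place.
 * Assumes type_range.upper is positive and both bounds fit the type.
 */
static struct int_range narrow_range(struct int_range type_range,
				     intmax_t lower_bound,
				     uintmax_t upper_bound)
{
	struct int_range r = type_range;

	if (lower_bound && lower_bound > r.lower)
		r.lower = lower_bound;
	if (upper_bound && upper_bound < (uintmax_t)r.upper)
		r.upper = (intmax_t)upper_bound;
	return r;
}

/* Check a parsed value against the effective range (inclusive). */
static bool in_range(struct int_range r, intmax_t value)
{
	return value >= r.lower && value <= r.upper;
}
```

For a 32 bit integer with bounds `-10` and `10`, this accepts `-10`, `0`
and `10` and rejects `-11` and `11`, which matches what the new tests in
t0040 expect from `--ibounded`.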

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 parse-options.c               | 40 ++++++++++++++++++++++++++++-----
 parse-options.h               | 52 +++++++++++++++++++++++++++++++++++++++++++
 t/helper/test-parse-options.c |  5 +++++
 t/t0040-parse-options.sh      | 33 +++++++++++++++++++++++++++
 4 files changed, 125 insertions(+), 5 deletions(-)

Comments

Junio C Hamano April 16, 2025, 7:19 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> In the preceding commits we have introduced integer precisions. The
> precision merely tracks bounds of the underlying data types so that we
> don't try to for example write a `size_t` into an `unsigned`, which
> could otherwise cause out-of-bounds writes.
>
> Some options may have bounds that are stricter than the underlying data
> type. Right now, users of any such options would have to manually verify
> that the value passed to such an option is inside the expected bounds.
> This is rather tedious, and it leads to code duplication across sites
> that wish to perform such bounds checks.
>
> Introduce `OPT_*_BOUNDED()` options that alleviate this issue. Users
> can optionally specify both a lower and upper bound, and if set we will
> verify that the value passed by the user is in that range.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  parse-options.c               | 40 ++++++++++++++++++++++++++++-----
>  parse-options.h               | 52 +++++++++++++++++++++++++++++++++++++++++++
>  t/helper/test-parse-options.c |  5 +++++
>  t/t0040-parse-options.sh      | 33 +++++++++++++++++++++++++++
>  4 files changed, 125 insertions(+), 5 deletions(-)

It is certainly cute, but unless there are plenty of existing users
that use OPT_INTEGER() and friends and perform bounds checks
themselves, I am not sure if this can withstand YAGNI criticism.
And this step being at the end of the series, plus the above
diffstat, tells us that there aren't any existing users converted to
use this new mechanism.

OPT_INTEGER that wants to track percentage may want to say the value
is between 0 and 100 (inclusive), but instead we take it bounded not
to exceed 100, without lower bound.  Without a real callsite, we
cannot even tell if it is acceptable compromise for the sake of
simplicity to forbid 0 as lower or upper bound, for example.

Thanks.

Patrick Steinhardt April 17, 2025, 8:14 a.m. UTC | #2
On Wed, Apr 16, 2025 at 12:19:31PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > In the preceding commits we have introduced integer precisions. The
> > precision merely tracks bounds of the underlying data types so that we
> > don't try to for example write a `size_t` into an `unsigned`, which
> > could otherwise cause out-of-bounds writes.
> >
> > Some options may have bounds that are stricter than the underlying data
> > type. Right now, users of any such options would have to manually verify
> > that the value passed to such an option is inside the expected bounds.
> > This is rather tedious, and it leads to code duplication across sites
> > that wish to perform such bounds checks.
> >
> > Introduce `OPT_*_BOUNDED()` options that alleviate this issue. Users
> > can optionally specify both a lower and upper bound, and if set we will
> > verify that the value passed by the user is in that range.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  parse-options.c               | 40 ++++++++++++++++++++++++++++-----
> >  parse-options.h               | 52 +++++++++++++++++++++++++++++++++++++++++++
> >  t/helper/test-parse-options.c |  5 +++++
> >  t/t0040-parse-options.sh      | 33 +++++++++++++++++++++++++++
> >  4 files changed, 125 insertions(+), 5 deletions(-)
> 
> It is certainly cute, but unless there are plenty of existing users
> that use OPT_INTEGER() and friends and perform bounds checks
> themselves, I am not sure if this can withstand YAGNI criticism.
> And this step being at the end of the series, plus the above
> diffstat, tells us that there aren't any existing users converted to
> use this new mechanism.

Yeah, that was also a bit of my feeling. I was on the lookout for
callsites, but I ultimately didn't find too many. Which is basically the
reason why I said that this patch is more of a PoC, and that I'm happy
to drop it again.

> OPT_INTEGER that wants to track percentage may want to say the value
> is between 0 and 100 (inclusive), but instead we take it bounded not
> to exceed 100, without lower bound.  Without a real callsite, we
> cannot even tell if it is acceptable compromise for the sake of
> simplicity to forbid 0 as lower or upper bound, for example.

Yes, `0` meaning "default" is restricting us here. But my
counterargument is that a value that can only be between `0` and `100`
should use `OPT_UNSIGNED` in the first place, which allows us to achieve
exactly that.
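
To make the percentage case concrete, here is a small standalone model of
the unsigned check. `unsigned_value_ok()` is a hypothetical helper that
only mirrors the shape of the `OPTION_UNSIGNED` branch, with `0` meaning
"bound not set". Because the minimum of an unsigned type already is `0`,
an unset lower bound and a lower bound of `0` coincide.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only. `type_max` is the largest
 * value of the destination type; a bound of 0 means "not set".
 */
static bool unsigned_value_ok(uintmax_t value, uintmax_t type_max,
			      intmax_t lower_bound, uintmax_t upper_bound)
{
	uintmax_t lower = 0, upper = type_max;

	if (lower_bound > 0)
		lower = (uintmax_t)lower_bound;
	if (upper_bound && upper_bound < upper)
		upper = upper_bound;

	return value >= lower && value <= upper;
}
```

Called with a lower bound of `0` and an upper bound of `100`, the helper
accepts `0` through `100` and rejects `101`. The same pair of bounds on a
signed option would leave the lower end at the type's minimum, which is
the limitation raised in the review.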

Let's just drop this patch for now. It was only a PoC anyway, and we can
use it as inspiration if we ever see that this feature is something we
want.

Patrick

Patch

diff --git a/parse-options.c b/parse-options.c
index e4dc22464b2..d1dffcfdf5f 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -177,6 +177,20 @@  static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 		intmax_t lower_bound = -upper_bound - 1;
 		intmax_t value;
 
+		if (opt->lower_bound) {
+			if (opt->lower_bound < lower_bound)
+				BUG("invalid lower bound for option %s", optname(opt, flags));
+			if (opt->lower_bound > lower_bound)
+				lower_bound = opt->lower_bound;
+		}
+
+		if (opt->upper_bound) {
+			if (opt->upper_bound > (uintmax_t)upper_bound)
+				BUG("invalid upper bound for option %s", optname(opt, flags));
+			if (opt->upper_bound < (uintmax_t)upper_bound)
+				upper_bound = opt->upper_bound;
+		}
+
 		if (unset) {
 			value = 0;
 		} else if (opt->flags & PARSE_OPT_OPTARG && !p->opt) {
@@ -225,8 +239,16 @@  static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 	case OPTION_UNSIGNED:
 	{
 		uintmax_t upper_bound = UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * opt->precision);
+		uintmax_t lower_bound = 0;
 		uintmax_t value;
 
+		if (opt->lower_bound < 0)
+			BUG("invalid lower bound for option %s", optname(opt, flags));
+		if (opt->lower_bound > 0)
+			lower_bound = opt->lower_bound;
+		if (opt->upper_bound && opt->upper_bound < upper_bound)
+			upper_bound = opt->upper_bound;
+
 		if (unset) {
 			value = 0;
 		} else if (opt->flags & PARSE_OPT_OPTARG && !p->opt) {
@@ -247,16 +269,16 @@  static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 					     optname(opt, flags));
 			if (errno == ERANGE)
 				return error(_("value %s for %s not in range [%"PRIuMAX",%"PRIuMAX"]"),
-					     arg, optname(opt, flags), (uintmax_t)0, (uintmax_t)upper_bound);
+					     arg, optname(opt, flags), (uintmax_t)lower_bound, (uintmax_t)upper_bound);
 			if (errno)
 				return error_errno(_("value %s for %s cannot be parsed"),
 						   arg, optname(opt, flags));
 
 		}
 
-		if (value > upper_bound)
+		if (value < lower_bound || value > upper_bound)
 			return error(_("value %s for %s not in range [%"PRIuMAX",%"PRIuMAX"]"),
-				     arg, optname(opt, flags), (uintmax_t)0, (uintmax_t)upper_bound);
+				     arg, optname(opt, flags), (uintmax_t)lower_bound, (uintmax_t)upper_bound);
 
 		switch (opt->precision) {
 		case 1:
@@ -279,8 +301,16 @@  static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 	case OPTION_MAGNITUDE:
 	{
 		uintmax_t upper_bound = UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * opt->precision);
+		uintmax_t lower_bound = 0;
 		unsigned long value;
 
+		if (opt->lower_bound < 0)
+			BUG("invalid lower bound for option %s", optname(opt, flags));
+		if (opt->lower_bound > 0)
+			lower_bound = opt->lower_bound;
+		if (opt->upper_bound && opt->upper_bound < upper_bound)
+			upper_bound = opt->upper_bound;
+
 		if (unset) {
 			value = 0;
 		} else if (opt->flags & PARSE_OPT_OPTARG && !p->opt) {
@@ -293,9 +323,9 @@  static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 				     optname(opt, flags));
 		}
 
-		if (value > upper_bound)
+		if (value < lower_bound || value > upper_bound)
 			return error(_("value %s for %s not in range [%"PRIuMAX",%"PRIuMAX"]"),
-				     arg, optname(opt, flags), (uintmax_t)0, (uintmax_t)upper_bound);
+				     arg, optname(opt, flags), (uintmax_t)lower_bound, (uintmax_t)upper_bound);
 
 		switch (opt->precision) {
 		case 1:
diff --git a/parse-options.h b/parse-options.h
index 168df642386..c1ebdaf7639 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -97,6 +97,13 @@  typedef int parse_opt_subcommand_fn(int argc, const char **argv,
  *   precision of the integer pointed to by `value` in number of bytes. Should
  *   typically be its `sizeof()`.
  *
+ * `lower_bound`,`upper_bound`::
+ *   lower and upper bound of the integer to further restrict the accepted
+ *   range of integer values. `0` will use the minimum and maximum values for
+ *   the integer type of the specified precision. Specifying a bound that does
+ *   not fit into an integer type of the specified precision will trigger a
+ *   bug.
+ *
  * `argh`::
  *   token to explain the kind of argument this option wants. Does not
  *   begin in capital letter, and does not end with a full stop.
@@ -157,6 +164,8 @@  struct option {
 	const char *long_name;
 	void *value;
 	size_t precision;
+	intmax_t lower_bound;
+	uintmax_t upper_bound;
 	const char *argh;
 	const char *help;
 
@@ -225,6 +234,19 @@  struct option {
 	.help = (h), \
 	.flags = (f), \
 }
+#define OPT_INTEGER_BOUNDED_F(s, l, v, lower, upper, h, f) { \
+	.type = OPTION_INTEGER, \
+	.short_name = (s), \
+	.long_name = (l), \
+	.value = (v) + BARF_UNLESS_SIGNED(*(v)), \
+	.precision = sizeof(*v), \
+	.lower_bound = (lower), \
+	.upper_bound = (upper), \
+	.argh = N_("n"), \
+	.help = (h), \
+	.flags = (f), \
+}
+
 #define OPT_UNSIGNED_F(s, l, v, h, f) { \
 	.type = OPTION_UNSIGNED, \
 	.short_name = (s), \
@@ -235,6 +257,18 @@  struct option {
 	.help = (h), \
 	.flags = (f), \
 }
+#define OPT_UNSIGNED_BOUNDED_F(s, l, v, lower, upper, h, f) { \
+	.type = OPTION_UNSIGNED, \
+	.short_name = (s), \
+	.long_name = (l), \
+	.value = (v) + BARF_UNLESS_UNSIGNED(*(v)), \
+	.precision = sizeof(*v), \
+	.lower_bound = (lower), \
+	.upper_bound = (upper), \
+	.argh = N_("n"), \
+	.help = (h), \
+	.flags = (f), \
+}
 
 #define OPT_END() { \
 	.type = OPTION_END, \
@@ -287,7 +321,12 @@  struct option {
 #define OPT_CMDMODE(s, l, v, h, i)  OPT_CMDMODE_F(s, l, v, h, i, 0)
 
 #define OPT_INTEGER(s, l, v, h)     OPT_INTEGER_F(s, l, v, h, 0)
+#define OPT_INTEGER_BOUNDED(s, l, v, lower, upper, h) \
+	OPT_INTEGER_BOUNDED_F(s, l, v, lower, upper, h, 0)
 #define OPT_UNSIGNED(s, l, v, h)    OPT_UNSIGNED_F(s, l, v, h, 0)
+#define OPT_UNSIGNED_BOUNDED(s, l, v, lower, upper, h) \
+	OPT_UNSIGNED_BOUNDED_F(s, l, v, lower, upper, h, 0)
+
 #define OPT_MAGNITUDE(s, l, v, h) { \
 	.type = OPTION_MAGNITUDE, \
 	.short_name = (s), \
@@ -298,6 +337,19 @@  struct option {
 	.help = (h), \
 	.flags = PARSE_OPT_NONEG, \
 }
+#define OPT_MAGNITUDE_BOUNDED(s, l, v, lower, upper, h) { \
+	.type = OPTION_MAGNITUDE, \
+	.short_name = (s), \
+	.long_name = (l), \
+	.value = (v) + BARF_UNLESS_UNSIGNED(*(v)), \
+	.precision = sizeof(*v), \
+	.lower_bound = (lower), \
+	.upper_bound = (upper), \
+	.argh = N_("n"), \
+	.help = (h), \
+	.flags = PARSE_OPT_NONEG, \
+}
+
 #define OPT_STRING(s, l, v, a, h)   OPT_STRING_F(s, l, v, a, h, 0)
 #define OPT_STRING_LIST(s, l, v, a, h) { \
 	.type = OPTION_CALLBACK, \
diff --git a/t/helper/test-parse-options.c b/t/helper/test-parse-options.c
index 0d559288d9c..0fcceec56a7 100644
--- a/t/helper/test-parse-options.c
+++ b/t/helper/test-parse-options.c
@@ -120,7 +120,9 @@  int cmd__parse_options(int argc, const char **argv)
 	};
 	struct string_list expect = STRING_LIST_INIT_NODUP;
 	struct string_list list = STRING_LIST_INIT_NODUP;
+	uint32_t mbounded = 0, ubounded = 0;
 	uint16_t m16 = 0, u16 = 0;
+	int32_t ibounded = 0;
 	int16_t i16 = 0;
 
 	struct option options[] = {
@@ -142,10 +144,13 @@  int cmd__parse_options(int argc, const char **argv)
 		OPT_GROUP(""),
 		OPT_INTEGER('i', "integer", &integer, "get a integer"),
 		OPT_INTEGER(0, "i16", &i16, "get a 16 bit integer"),
+		OPT_INTEGER_BOUNDED(0, "ibounded", &ibounded, -10, 10, "get a bounded integer between [-10,10]"),
 		OPT_UNSIGNED(0, "u16", &u16, "get a 16 bit unsigned integer"),
+		OPT_UNSIGNED_BOUNDED(0, "ubounded", &ubounded, 10, 100, "get a bounded unsigned integer between [10,100]"),
 		OPT_INTEGER('j', NULL, &integer, "get a integer, too"),
 		OPT_MAGNITUDE('m', "magnitude", &magnitude, "get a magnitude"),
 		OPT_MAGNITUDE(0, "m16", &m16, "get a 16 bit magnitude"),
+		OPT_MAGNITUDE_BOUNDED(0, "mbounded", &mbounded, 10, 100, "get a bounded magnitude between [10,100]"),
 		OPT_SET_INT(0, "set23", &integer, "set integer to 23", 23),
 		OPT_CMDMODE(0, "mode1", &integer, "set integer to 1 (cmdmode option)", 1),
 		OPT_CMDMODE(0, "mode2", &integer, "set integer to 2 (cmdmode option)", 2),
diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
index 66875ce0586..d76165c2053 100755
--- a/t/t0040-parse-options.sh
+++ b/t/t0040-parse-options.sh
@@ -23,10 +23,13 @@  usage: test-tool parse-options <options>
     -i, --[no-]integer <n>
                           get a integer
     --[no-]i16 <n>        get a 16 bit integer
+    --[no-]ibounded <n>   get a bounded integer between [-10,10]
     --[no-]u16 <n>        get a 16 bit unsigned integer
+    --[no-]ubounded <n>   get a bounded unsigned integer between [10,100]
     -j <n>                get a integer, too
     -m, --magnitude <n>   get a magnitude
     --m16 <n>             get a 16 bit magnitude
+    --mbounded <n>        get a bounded magnitude between [10,100]
     --[no-]set23          set integer to 23
     --mode1               set integer to 1 (cmdmode option)
     --mode2               set integer to 2 (cmdmode option)
@@ -848,4 +851,34 @@  test_expect_success 'u16 does not accept negative value' '
 	test_must_be_empty out
 '
 
+test_expect_success 'ibounded does not accept outside range' '
+	test_must_fail test-tool parse-options --ibounded -11 >out 2>err &&
+	test_grep "value -11 for option .ibounded. not in range \[-10,10\]" err &&
+	test_must_fail test-tool parse-options --ibounded 11 >out 2>err &&
+	test_grep "value 11 for option .ibounded. not in range \[-10,10\]" err &&
+	test-tool parse-options --ibounded -10 &&
+	test-tool parse-options --ibounded 0 &&
+	test-tool parse-options --ibounded 10
+'
+
+test_expect_success 'ubounded does not accept outside range' '
+	test_must_fail test-tool parse-options --ubounded 9 >out 2>err &&
+	test_grep "value 9 for option .ubounded. not in range \[10,100\]" err &&
+	test_must_fail test-tool parse-options --ubounded 101 >out 2>err &&
+	test_grep "value 101 for option .ubounded. not in range \[10,100\]" err &&
+	test-tool parse-options --ubounded 10 &&
+	test-tool parse-options --ubounded 50 &&
+	test-tool parse-options --ubounded 100
+'
+
+test_expect_success 'mbounded does not accept outside range' '
+	test_must_fail test-tool parse-options --mbounded 9 >out 2>err &&
+	test_grep "value 9 for option .mbounded. not in range \[10,100\]" err &&
+	test_must_fail test-tool parse-options --mbounded 101 >out 2>err &&
+	test_grep "value 101 for option .mbounded. not in range \[10,100\]" err &&
+	test-tool parse-options --mbounded 10 &&
+	test-tool parse-options --mbounded 50 &&
+	test-tool parse-options --mbounded 100
+'
+
 test_done