diff mbox series

[2/2] patch-id: replace `atoi()` with `strtol_i2()`

Message ID 1ece724b1ca7f71542bfef42795fce798563ecde.1705913519.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Replace atoi() with strtol_i2() | expand

Commit Message

Mohit Marathe Jan. 22, 2024, 8:51 a.m. UTC
From: Mohit Marathe <mohitmarathe23@gmail.com>

The change is made to improve the error-handling capabilities
during the conversion of string representations to integers.
The `strtol_i2(` function offers a more robust mechanism for
converting strings to integers by providing enhanced error
detection. Unlike `atoi(`, `strtol_i2(` allows the code to
differentiate between a valid conversion and an invalid one,
offering better resilience against potential issues such as
reading hunk header of a corrupted patch.

Signed-off-by: Mohit Marathe <mohitmarathe@proton.me>
---
 builtin/patch-id.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Junio C Hamano Jan. 22, 2024, 7:32 p.m. UTC | #1
"Mohit Marathe via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  	static const char digits[] = "0123456789";
>  	const char *q, *r;
> +	char *endp;
>  	int n;
>  
>  	q = p + 4;
>  	n = strspn(q, digits);
>  	if (q[n] == ',') {
>  		q += n + 1;
> -		*p_before = atoi(q);
> +		if (strtol_i2(q, 10, p_before, &endp) != 0)
> +			return 0;
>  		n = strspn(q, digits);
>  	} else {
>  		*p_before = 1;
>  	}

Looking at this code again, because we upfront run strspn() to make
sure q[] begins with a run of digits *and* followed by a comma
(which is not a digit), I think it is safe to use atoi() and assume
it would slurp all the digits.  So the lack of another check the use
of new helper allows us to do, namely

		if (endp != q + n)
			return 0;

is probably OK, but that is one of the two reasons why you would
favor the use of new helper over atoi(), so the upside of this
change is not all that great as I originally hoped for X-<.

Not your fault, of course.  We would still catch when the digit
string that starts q[] is too large to fit in an int, which is an
upside.

> -	if (n == 0 || q[n] != ' ' || q[n+1] != '+')
> +	if (q[n] != ' ' || q[n+1] != '+')
>  		return 0;

When we saw q[] that begins with ',' upon entry to this function, we
used to say *p_before = 1 and then saw n==0 and realized it is not a
good input and returned 0 from the function.

Now we instead peek q[0] and the check says q[0] is not SP so we
will return 0 the same way so there is no behaviour change from the
upper hunk?  The conversion may be correct, but it wasn't explained
in the proposed commit log message.

How are the change to stop caring about n==0 here ...

>  	r = q + n + 2;
>  	n = strspn(r, digits);
>  	if (r[n] == ',') {
>  		r += n + 1;
> -		*p_after = atoi(r);
> -		n = strspn(r, digits);
> +		if (strtol_i2(r, 10, p_after, &endp) != 0)
> +			return 0;
>  	} else {
>  		*p_after = 1;
>  	}
> -	if (n == 0)
> -		return 0;

... and this change here, linked to the switch from atoi() to
strtul_i2()[*]?

It looks like an unrelated behaviour change that is left
unexplained.

>  	return 1;
>  }

Thanks for working on this one.


[Footnote]

 * by the way, what a horrible name for a public function.  Yuck.
Mohit Marathe Jan. 24, 2024, 4:22 a.m. UTC | #2
On Tuesday, January 23rd, 2024 at 1:02 AM, Junio C Hamano <gitster@pobox.com> wrote:

> "Mohit Marathe via GitGitGadget" gitgitgadget@gmail.com writes:
> 
> > static const char digits[] = "0123456789";
> > const char *q, *r;
> > + char *endp;
> > int n;
> > 
> > q = p + 4;
> > n = strspn(q, digits);
> > if (q[n] == ',') {
> > q += n + 1;
> > - *p_before = atoi(q);
> > + if (strtol_i2(q, 10, p_before, &endp) != 0)
> > + return 0;
> > n = strspn(q, digits);
> > } else {
> > *p_before = 1;
> > }
> 
> 
> Looking at this code again, because we upfront run strspn() to make
> sure q[] begins with a run of digits and followed by a comma
> (which is not a digit), I think it is safe to use atoi() and assume
> it would slurp all the digits. So the lack of another check the use
> of new helper allows us to do, namely
> 
> if (endp != q + n)
> return 0;
> 
> is probably OK, but that is one of the two reasons why you would
> favor the use of new helper over atoi(), so the upside of this
> change is not all that great as I originally hoped for X-<.
> 
> Not your fault, of course. We would still catch when the digit
> string that starts q[] is too large to fit in an int, which is an
> upside.
> 
> > - if (n == 0 || q[n] != ' ' || q[n+1] != '+')
> > + if (q[n] != ' ' || q[n+1] != '+')
> > return 0;
> 
> 
> When we saw q[] that begins with ',' upon entry to this function, we
> used to say *p_before = 1 and then saw n==0 and realized it is not a
> good input and returned 0 from the function.

Uh oh, I just looked at the `if` block and concluded that it was just 
to check if it has numbers after the ',', which`strtol_i2()` already 
does. But I totally missed this one. 

> Now we instead peek q[0] and the check says q[0] is not SP so we
> will return 0 the same way so there is no behaviour change from the
> upper hunk? The conversion may be correct, but it wasn't explained
> in the proposed commit log message.
> 
> How are the change to stop caring about n==0 here ...
> 
> > r = q + n + 2;
> > n = strspn(r, digits);
> > if (r[n] == ',') {
> > r += n + 1;
> > - *p_after = atoi(r);
> > - n = strspn(r, digits);
> > + if (strtol_i2(r, 10, p_after, &endp) != 0)
> > + return 0;
> > } else {
> > *p_after = 1;
> > }
> > - if (n == 0)
> > - return 0;
> 
> 
> ... and this change here, linked to the switch from atoi() to
> strtul_i2()[*]?
> 
> It looks like an unrelated behaviour change that is left
> unexplained.
> 
> > return 1;
> > }
> 
> 
> Thanks for working on this one.
> 
> 
> [Footnote]
> 
> * by the way, what a horrible name for a public function. Yuck.

Yeah, I thought so too /:D How does `strtol_i_updated` sounds?

Thanks for you feedback! I will send v2 with the corrections soon.
diff mbox series

Patch

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 3894d2b9706..6bfb263de5e 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -1,3 +1,4 @@ 
+#include "git-compat-util.h"
 #include "builtin.h"
 #include "config.h"
 #include "diff.h"
@@ -29,33 +30,32 @@  static int scan_hunk_header(const char *p, int *p_before, int *p_after)
 {
 	static const char digits[] = "0123456789";
 	const char *q, *r;
+	char *endp;
 	int n;
 
 	q = p + 4;
 	n = strspn(q, digits);
 	if (q[n] == ',') {
 		q += n + 1;
-		*p_before = atoi(q);
+		if (strtol_i2(q, 10, p_before, &endp) != 0)
+			return 0;
 		n = strspn(q, digits);
 	} else {
 		*p_before = 1;
 	}
 
-	if (n == 0 || q[n] != ' ' || q[n+1] != '+')
+	if (q[n] != ' ' || q[n+1] != '+')
 		return 0;
 
 	r = q + n + 2;
 	n = strspn(r, digits);
 	if (r[n] == ',') {
 		r += n + 1;
-		*p_after = atoi(r);
-		n = strspn(r, digits);
+		if (strtol_i2(r, 10, p_after, &endp) != 0)
+			return 0;
 	} else {
 		*p_after = 1;
 	}
-	if (n == 0)
-		return 0;
-
 	return 1;
 }