[v2] builtin/mv.c: fix possible segfault in add_slash()

Message ID 20220909194458.264735-1-shaoxuan.yuan02@gmail.com (mailing list archive)
State Superseded

Commit Message

Shaoxuan Yuan Sept. 9, 2022, 7:44 p.m. UTC
A possible segfault was introduced in c08830de41 (mv: check if
<destination> is a SKIP_WORKTREE_DIR, 2022-08-09).

When running t7001 with SANITIZE=address, a problem appears when running:

	git mv path1/path2/ .
or
	git mv directory ../
or
	any <destination> that makes dest_path[0] an empty string.

The add_slash() call could segfault when dest_path[0] is an empty string,
because it was accessing a null value in such case.

Change add_slash() to check the path argument is a non-empty string
before accessing its value. If the path is empty, return it as-is.

Explanation:

It's OK for add_slash() to return an empty string as-is. add_slash()
converts its path argument to the prefix (for "folder1/file1",
"folder1/" is the prefix we mean here) for the result path. The path
argument is an empty string _iff_ the result path is analyzed to be at
the top level (this normalization process is done earlier by
internal_prefix_pathspec()).

Because the prefix for a top-level path is an empty string,
add_slash() should return an empty path argument as-is, both for
correctness and to avoid inappropriate memory access.

Reported-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
Range-diff against v1:
1:  a5dccc030c ! 1:  82353f457d builtin/mv.c: fix possible segfault in add_slash()
    @@ Commit message
         or
                 any <destination> that makes dest_path[0] an empty string.
     
    -    The add_slash() call segfaults when dest_path[0] is an empty string,
    +    The add_slash() call could segfault when dest_path[0] is an empty string,
         because it was accessing a null value in such case.
     
         Change add_slash() to check the path argument is a non-empty string
    -    before accessing its value.
    +    before accessing its value. If the path is empty, return it as-is.
     
    -    The purpose of add_slash() is adding a slash to the end of a string to
    -    construct a directory path. And, because adding a slash to an empty
    -    string is of no use here, and checking the string value without checking
    -    it is non-empty leads to segfault, we should make sure the length of the
    -    string is positive to solve both problems.
    +    Explanation:
    +
    +    It's OK for add_slash() to return an empty string as-is. add_slash()
    +    converts its path argument to the prefix (for "folder1/file1",
    +    "folder1/" is the prefix we mean here) for the result path. The path
    +    argument is an empty string _iff_ the result path is analyzed to be at
    +    the top level (this normalization process is done earlier by
    +    internal_prefix_pathspec()).
    +
    +    Because the prefix for a top-level path is an empty string, thus
    +    add_slash() should return an empty path argument as-is, both for
    +    correctness and avoiding inappropriate memory access.
     
         Reported-by: Jeff King <peff@peff.net>
         Helped-by: Jeff King <peff@peff.net>
    +    Helped-by: Junio C Hamano <gitster@pobox.com>
    +    Helped-by: Derrick Stolee <derrickstolee@github.com>
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##

 builtin/mv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: e71f9b1de6941c8b449d0c0e17e457f999664bc9

Comments

Junio C Hamano Sept. 9, 2022, 8:04 p.m. UTC | #1
Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> writes:

> A possible segfault was introduced in c08830de41 (mv: check if
> <destination> is a SKIP_WORKTREE_DIR, 2022-08-09).
>
> When running t7001 with SANITIZE=address, problem appears when running:
>
> 	git mv path1/path2/ .
> or
> 	git mv directory ../
> or
> 	any <destination> that makes dest_path[0] an empty string.
>
> The add_slash() call could segfault when dest_path[0] is an empty string,
> because it was accessing a null value in such case.

Terminology.  The relevant preimage is

>  	size_t len = strlen(path);
> -	if (path[len - 1] != '/') {

An access to path[-1] is an out-of-bounds access.
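
To make the index arithmetic behind that remark concrete, here is a minimal sketch. The helper names are hypothetical and not part of builtin/mv.c; the sketch only computes the index the unpatched check would use and never dereferences it. Because len is a size_t, len - 1 wraps around to SIZE_MAX for an empty string, which in practice typically addresses the byte just before the string.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not in builtin/mv.c): the index that the
 * unpatched "path[len - 1]" expression would read.  size_t is
 * unsigned, so for an empty string this wraps to SIZE_MAX.
 */
size_t last_index_unchecked(const char *path)
{
	assert(path != NULL);
	return strlen(path) - 1;
}

/*
 * Hypothetical helper: valid indices into a C string run from 0 to
 * strlen(path) inclusive (the final one is the NUL terminator), so
 * anything larger lies outside the string.
 */
int index_is_out_of_bounds(const char *path)
{
	return last_index_unchecked(path) > strlen(path);
}
```

For "dir" the index is 2 and stays in bounds; for "" it is SIZE_MAX, which is why the read is out of bounds rather than an access to a "null value".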

> Change add_slash() to check the path argument is a non-empty string
> before accessing its value. If the path is empty, return it as-is.

That is not wrong per-se, but...

> Explanation:

... you'd need this funny label here.  If this is where your
explanation begins, what was the reader reading before it? ;-)

The logic would flow more naturally if you added your "explanation"
material between "what is wrong in the current code" and "what to do
to fix it", perhaps like so:

	... could segfault when path argument to it is an empty
	string, because it makes an out-of-bounds read to decide if
	an extra slash '/' needs to be appended to it.

	As add_slash() is used to make sure that a valid pathname to
	a file in the given directory can be made by appending a
	filename after the value returned from it, if path is an
	empty string, we want to return it as-is.  The path to a
	file "F" in the top-level of the working tree (i.e.
	path=="") is formed by appending "F" after "" (i.e. path)
	without any slash in between.

	So, just like the case where a non-empty path already ends
	with a slash, return an empty path as-is.


> diff --git a/builtin/mv.c b/builtin/mv.c
> index 2d64c1e80f..3413ad1c9b 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -71,7 +71,7 @@ static const char **internal_prefix_pathspec(const char *prefix,
>  static const char *add_slash(const char *path)
>  {
>  	size_t len = strlen(path);
> -	if (path[len - 1] != '/') {
> +	if (len && path[len - 1] != '/') {
>  		char *with_slash = xmalloc(st_add(len, 2));
>  		memcpy(with_slash, path, len);
>  		with_slash[len++] = '/';

Yup.  It cannot be seen in the patch but the post-context of this
hunk just returns path as-is, which is what we want to happen.

Thanks.

Shaoxuan Yuan Sept. 9, 2022, 10:40 p.m. UTC | #2
On 9/9/2022 1:04 PM, Junio C Hamano wrote:
> Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> writes:
> 
>> A possible segfault was introduced in c08830de41 (mv: check if
>> <destination> is a SKIP_WORKTREE_DIR, 2022-08-09).
>>
>> When running t7001 with SANITIZE=address, problem appears when running:
>>
>> 	git mv path1/path2/ .
>> or
>> 	git mv directory ../
>> or
>> 	any <destination> that makes dest_path[0] an empty string.
>>
>> The add_slash() call could segfault when dest_path[0] is an empty string,
>> because it was accessing a null value in such case.
> 
> Terminology.  The relevant preimage is
> 
>>  	size_t len = strlen(path);
>> -	if (path[len - 1] != '/') {
> 
> An access to path[-1] is an out-of-bounds access.

Thanks for the term, new thing learned :-)

>> Change add_slash() to check the path argument is a non-empty string
>> before accessing its value. If the path is empty, return it as-is.
> 
> That is not wrong per-se, but...
> 
>> Explanation:
> 
> ... you'd need this funny label here.  If this is where your
> explanation begins, what was the reader reading before it? ;-)
> 
> The logic would flow more naturally if you added your "explanation"
> material between "what is wrong in the current code" and "what to do
> to fix it", perhaps like so:

Indeed, explanation before action sounds more reasonable.

> 	... could segfault when path argument to it is an empty
> 	string, because it makes an out-of-bounds read to decide if
> 	an extra slash '/' needs to be appended to it.
> 
> 	As add_slash() is used to make sure that a valid pathname to
> 	a file in the given directory can be made by appending a
> 	filename after the value returned from it, if path is an
> 	empty string, we want to return it as-is.  The path to a
> 	file "F" in the top-level of the working tree (i.e.
> 	path=="") is formed by appending "F" after "" (i.e. path)
> 	without any slash in between.
> 
> 	So, just like the case where a non-empty path already ends
> 	with a slash, return an empty path as-is.
> 

Thanks for the paraphrase, I put it in the v3 just sent.

>> diff --git a/builtin/mv.c b/builtin/mv.c
>> index 2d64c1e80f..3413ad1c9b 100644
>> --- a/builtin/mv.c
>> +++ b/builtin/mv.c
>> @@ -71,7 +71,7 @@ static const char **internal_prefix_pathspec(const char *prefix,
>>  static const char *add_slash(const char *path)
>>  {
>>  	size_t len = strlen(path);
>> -	if (path[len - 1] != '/') {
>> +	if (len && path[len - 1] != '/') {
>>  		char *with_slash = xmalloc(st_add(len, 2));
>>  		memcpy(with_slash, path, len);
>>  		with_slash[len++] = '/';
> 
> Yup.  It cannot be seen in the patch but the post-context of this
> hunk just returns path as-is, which is what we want to happen.

Yes.

Thanks,
Shaoxuan

Patch

diff --git a/builtin/mv.c b/builtin/mv.c
index 2d64c1e80f..3413ad1c9b 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -71,7 +71,7 @@  static const char **internal_prefix_pathspec(const char *prefix,
 static const char *add_slash(const char *path)
 {
 	size_t len = strlen(path);
-	if (path[len - 1] != '/') {
+	if (len && path[len - 1] != '/') {
 		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);
 		with_slash[len++] = '/';
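
For readers who want to try the patched logic outside of Git, the following is a standalone sketch, not the code in builtin/mv.c. It assumes plain malloc() in place of Git's xmalloc() and st_add(), and the name add_slash_sketch is hypothetical. The lines after the hunk (terminating the copy and returning it) are also an assumption, since the hunk does not show them; the final "return path" follows the review comment that the post-context returns path as-is.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Standalone sketch of the patched check.  Returns path unchanged
 * when it is empty or already ends with '/', otherwise returns a
 * newly allocated copy with '/' appended.
 */
const char *add_slash_sketch(const char *path)
{
	size_t len;

	assert(path != NULL);
	len = strlen(path);
	if (len && path[len - 1] != '/') {
		/* room for the extra '/' and the terminating NUL */
		char *with_slash = malloc(len + 2);
		if (!with_slash)
			abort(); /* stand-in for xmalloc() dying on failure */
		memcpy(with_slash, path, len);
		with_slash[len++] = '/';
		with_slash[len] = '\0'; /* assumed; not shown in the hunk */
		return with_slash;
	}
	return path; /* empty, or already ends with '/' */
}
```

With this sketch, "dir" becomes "dir/", "dir/" is returned unchanged, and "" is returned unchanged, so appending a filename "F" to the result yields "dir/F" or plain "F" for the top level.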