diff mbox series

builtin/mv.c: fix possible segfault in add_slash()

Message ID 20220908230223.239970-1-shaoxuan.yuan02@gmail.com (mailing list archive)
State Superseded
Headers show
Series builtin/mv.c: fix possible segfault in add_slash() | expand

Commit Message

Shaoxuan Yuan Sept. 8, 2022, 11:02 p.m. UTC
A possible segfault was introduced in c08830de41 (mv: check if
<destination> is a SKIP_WORKTREE_DIR, 2022-08-09).

When running t7001 with SANITIZE=address, problem appears when running:

	git mv path1/path2/ .
or
	git mv directory ../
or
	any <destination> that makes dest_path[0] an empty string.

The add_slash() call segfaults when dest_path[0] is an empty string,
because it was accessing a null value in such case.

Change add_slash() to check the path argument is a non-empty string
before accessing its value.

The purpose of add_slash() is adding a slash to the end of a string to
construct a directory path. And, because adding a slash to an empty
string is of no use here, and checking the string value without checking
it is non-empty leads to segfault, we should make sure the length of the
string is positive to solve both problems.

Reported-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
Reference: https://lore.kernel.org/git/YwdJRRuST2SP8ZT7@coredump.intra.peff.net/

 builtin/mv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: e71f9b1de6941c8b449d0c0e17e457f999664bc9

Comments

Jeff King Sept. 9, 2022, 2:21 a.m. UTC | #1
On Thu, Sep 08, 2022 at 04:02:23PM -0700, Shaoxuan Yuan wrote:

> The purpose of add_slash() is adding a slash to the end of a string to
> construct a directory path. And, because adding a slash to an empty
> string is of no use here, and checking the string value without checking
> it is non-empty leads to segfault, we should make sure the length of the
> string is positive to solve both problems.

Thanks for picking this up. I had forgotten about it.

The patch looks obviously fine to me from the perspective of stopping
the segfault. I'll take your "of no use here" as a given, not being
familiar with the subtleties of mv's path handling. :) Assuming that's
correct, then everything looks good to me.

-Peff
Derrick Stolee Sept. 9, 2022, 2:14 p.m. UTC | #2
On 9/8/2022 7:02 PM, Shaoxuan Yuan wrote:
> A possible segfault was introduced in c08830de41 (mv: check if
> <destination> is a SKIP_WORKTREE_DIR, 2022-08-09).
> 
> When running t7001 with SANITIZE=address, problem appears when running:
> 
> 	git mv path1/path2/ .
> or
> 	git mv directory ../
> or
> 	any <destination> that makes dest_path[0] an empty string.
> 
> The add_slash() call segfaults when dest_path[0] is an empty string,
> because it was accessing a null value in such case.

It doesn't _always_ seg fault, since we have tests that cover this
case. Adding this change will cause t7001-mv.sh to start failing
in many places:

diff --git a/builtin/mv.c b/builtin/mv.c
index 2d64c1e80fe..8216680ad3c 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -71,6 +71,10 @@ static const char **internal_prefix_pathspec(const char *prefix,
 static const char *add_slash(const char *path)
 {
 	size_t len = strlen(path);
+
+	if (!len)
+		die("segfault?");
+
 	if (path[len - 1] != '/') {
 		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);

I suppose it is better to say "could segfault". Running the test
under --valgrind also causes a failure. It covers both cases, "."
and "../".

This is all to say that there is some subtlety to the situation, which
helps justify the lack of a new test case (the tests cover this case,
but require extra steps to show a failure).

> Change add_slash() to check the path argument is a non-empty string
> before accessing its value.
> 
> The purpose of add_slash() is adding a slash to the end of a string to
> construct a directory path. And, because adding a slash to an empty
> string is of no use here, and checking the string value without checking
> it is non-empty leads to segfault, we should make sure the length of the
> string is positive to solve both problems.

I agree that the code change is correct.

Thanks,
-Stolee
Junio C Hamano Sept. 9, 2022, 4:37 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

> On 9/8/2022 7:02 PM, Shaoxuan Yuan wrote:
>> A possible segfault was introduced in c08830de41 (mv: check if
>> <destination> is a SKIP_WORKTREE_DIR, 2022-08-09).
>> 
>> When running t7001 with SANITIZE=address, problem appears when running:
>> 
>> 	git mv path1/path2/ .
>> or
>> 	git mv directory ../
>> or
>> 	any <destination> that makes dest_path[0] an empty string.
>> 
>> The add_slash() call segfaults when dest_path[0] is an empty string,
>> because it was accessing a null value in such case.
>
> It doesn't _always_ seg fault, since we have tests that cover this
> case. Adding this change will cause t7001-mv.sh to start failing
> in many places:
>
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 2d64c1e80fe..8216680ad3c 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -71,6 +71,10 @@ static const char **internal_prefix_pathspec(const char *prefix,
>  static const char *add_slash(const char *path)
>  {
>  	size_t len = strlen(path);
> +
> +	if (!len)
> +		die("segfault?");
> +
>  	if (path[len - 1] != '/') {
>  		char *with_slash = xmalloc(st_add(len, 2));
>  		memcpy(with_slash, path, len);
>
> I suppose it is better to say "could segfault". Running the test
> under --valgrind also causes a failure. It covers both cases, "."
> and "../".

While "could segfault" is of course more correct, I do not see a
huge difference here, but that is only because I learned to equate
"segfaults" in our log messages with "makes an access to
inappropriate memory address".

If I were to suggest updating the proposed log message, I would
rather spend a bit more bytes to explain what callers expect
add_slash() to do, why they call the helper for.  It would make it
obvious why it is the right behaviour the callers expect for the
function to return an empty string as-is.

I _think_ the reason is that the caller of add_slash has the name of
a directory in the working tree (relative to the root of the working
tree) and wants to add strings to form pathnames to things in the
directory.  They have "Documentation" directory and are told to move
"Makefile" from somewhere into it, so they pass "Documentation",
want "Documentation/" back, to form "Documentation/Makefile" by
concatenating.  If they are told to move something to the toplevel,
the target would be originally given as "." and while driving the
machinery to rename something to "./Makefile" might also work,
because the pathnames are normalized fairly early by removing excess
dots and resolving double-dots, the actual 'path' passed to
add_slash() by the caller in this case is an empty string, not a
single dot.  And "move this Makefile sitting somewhere else to ."
means "the path to the resulting file is Makefile" (as opposed to
"the path to the resulting file is ./Makefile"), which is correct.

Of course, I expect the log message to explain it a lot more
concisely, instead of spending more than a dozen lines ;-)

Thanks.
diff mbox series

Patch

diff --git a/builtin/mv.c b/builtin/mv.c
index 2d64c1e80f..3413ad1c9b 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -71,7 +71,7 @@  static const char **internal_prefix_pathspec(const char *prefix,
 static const char *add_slash(const char *path)
 {
 	size_t len = strlen(path);
-	if (path[len - 1] != '/') {
+	if (len && path[len - 1] != '/') {
 		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);
 		with_slash[len++] = '/';