diff mbox series

[v4,09/12] unix-socket: disallow chdir() when creating unix domain sockets

Message ID 1bfa36409d0706d5e22703f80bf95dfa1a313a83.1613598529.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit ca012e6fcc03384b640216c3e31502710c949f4a
Headers show
Series Simple IPC Mechanism | expand

Commit Message

Jeff Hostetler Feb. 17, 2021, 9:48 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Calls to `chdir()` are dangerous in a multi-threaded context.  If
`unix_stream_listen()` or `unix_stream_connect()` is given a socket
pathname that is too long to fit in a `sockaddr_un` structure, it will
`chdir()` to the parent directory of the requested socket pathname,
create the socket using a relative pathname, and then `chdir()` back.
This is not thread-safe.

Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
flag is set.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/credential-cache.c |  2 +-
 unix-socket.c              | 17 ++++++++++++-----
 unix-socket.h              |  4 +++-
 3 files changed, 16 insertions(+), 7 deletions(-)

Comments

Junio C Hamano March 3, 2021, 10:53 p.m. UTC | #1
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Calls to `chdir()` are dangerous in a multi-threaded context.  If
> `unix_stream_listen()` or `unix_stream_connect()` is given a socket
> pathname that is too long to fit in a `sockaddr_un` structure, it will
> `chdir()` to the parent directory of the requested socket pathname,
> create the socket using a relative pathname, and then `chdir()` back.
> This is not thread-safe.
>
> Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
> flag is set.

While it is clear that this will not affect any existing callers, I
am not sure if this is a good direction to go in the longer term.

I have to wonder if somebody actually relies on this "feature",
though.  As long as ENAMETOOLONG is passed back to the caller so
that it can react to it, any caller that knows it is safe to chdir()
at the point of calling "send_request()" should be able to chdir()
itself and come back (or fork a child that chdirs and opens a unix
domain socket there, and then send the file descriptor back to the
parent process).

Thanks.
Jeff King March 4, 2021, 2:56 p.m. UTC | #2
On Wed, Mar 03, 2021 at 02:53:23PM -0800, Junio C Hamano wrote:

> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > Calls to `chdir()` are dangerous in a multi-threaded context.  If
> > `unix_stream_listen()` or `unix_stream_connect()` is given a socket
> > pathname that is too long to fit in a `sockaddr_un` structure, it will
> > `chdir()` to the parent directory of the requested socket pathname,
> > create the socket using a relative pathname, and then `chdir()` back.
> > This is not thread-safe.
> >
> > Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
> > flag is set.
> 
> While it is clear that this will not affect any existing callers, I
> am not sure if this is a good direction to go in the longer term.
> 
> I have to wonder if somebody actually relies on this "feature",
> though.  As long as ENAMETOOLONG is passed back to the caller so
> that it can react to it, any caller that knows it is safe to chdir()
> at the point of calling "send_request()" should be able to chdir()
> itself and come back (or fork a child that chdirs and opens a unix
> domain socket there, and then send the file descriptor back to the
> parent process).

The feature is definitely useful; I think I did 1eb10f4091 (unix-socket:
handle long socket pathnames, 2012-01-09) in response to a real problem.

Certainly callers could handle the error themselves. The reason I pushed
it down into the socket code was to avoid having to implement in
multiple callers. There are only two, but we'd have needed it in both
sides (credential-cache--daemon as the listener, and credential-cache as
the client).

Ironically, the listening side now does a permanent chdir() to the
socket directory anyway, since 6e61449051 (credential-cache--daemon:
change to the socket dir on startup, 2016-02-23). So we could just do
that first, and then feed the basename to the socket code.

The client side would still need to handle it, though. It could probably
also chdir to the socket directory without any real downside (once
started, I don't think the helper program needs to access the filesystem
at all outside of the socket).

So I dunno. I'd be OK to just rip the feature out in favor of doing
those chdir()s. But that seems like a non-zero amount of work versus
leaving, and the existing code has the benefit that if another caller
shows up, it could benefit from the feature.

-Peff
Junio C Hamano March 4, 2021, 8:34 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> The feature is definitely useful; I think I did 1eb10f4091 (unix-socket:
> handle long socket pathnames, 2012-01-09) in response to a real problem.
>
> Certainly callers could handle the error themselves. The reason I pushed
> it down into the socket code was to avoid having to implement in
> multiple callers. There are only two, but we'd have needed it in both
> sides (credential-cache--daemon as the listener, and credential-cache as
> the client).
>
> Ironically, the listening side now does a permanent chdir() to the
> socket directory anyway, since 6e61449051 (credential-cache--daemon:
> change to the socket dir on startup, 2016-02-23). So we could just do
> that first, and then feed the basename to the socket code.
>
> The client side would still need to handle it, though. It could probably
> also chdir to the socket directory without any real downside (once
> started, I don't think the helper program needs to access the filesystem
> at all outside of the socket).
>
> So I dunno. I'd be OK to just rip the feature out in favor of doing
> those chdir()s. But that seems like a non-zero amount of work versus
> leaving, and the existing code has the benefit that if another caller
> shows up, it could benefit from the feature.

I am OK to keep the series as-is, and leave it to a possible future
work to remove the need for chdir even for long paths and not having
to return an error with ENAMETOOLONG; when such an update happens,
the "fail if need to chdir" feature this patch is adding will become
a no-op.
Junio C Hamano March 4, 2021, 11:34 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

>> So I dunno. I'd be OK to just rip the feature out in favor of doing
>> those chdir()s. But that seems like a non-zero amount of work versus
>> leaving, and the existing code has the benefit that if another caller
>> shows up, it could benefit from the feature.
>
> I am OK to keep the series as-is, and leave it to a possible future
> work to remove the need for chdir even for long paths and not having
> to return an error with ENAMETOOLONG; when such an update happens,
> the "fail if need to chdir" feature this patch is adding will become
> a no-op.

For example, as this is UNIX-only codepath, I wonder if something
like this would be a good way to avoid chdir() that would cause
trouble.

    - obtain a fd from socket(2)
    - check if path is too long to fit in sa->sun_path
      - if it does, bind(2) the fd to the address
      - if it does not, fork(2) a child and
        - in the child, chdir(2) there and use the shortened path
	  to bind(2), and exit(3)
        - the parents just wait(2)s for the child to return. By the
          time it dies, the fd would be successfully bound to the
	  path.
    - now we have a file descriptor that is bound at that path.
Jeff King March 5, 2021, 9:02 a.m. UTC | #5
On Thu, Mar 04, 2021 at 03:34:07PM -0800, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> >> So I dunno. I'd be OK to just rip the feature out in favor of doing
> >> those chdir()s. But that seems like a non-zero amount of work versus
> >> leaving, and the existing code has the benefit that if another caller
> >> shows up, it could benefit from the feature.
> >
> > I am OK to keep the series as-is, and leave it to a possible future
> > work to remove the need for chdir even for long paths and not having
> > to return an error with ENAMETOOLONG; when such an update happens,
> > the "fail if need to chdir" feature this patch is adding will become
> > a no-op.
> 
> For example, as this is UNIX-only codepath, I wonder if something
> like this would be a good way to avoid chdir() that would cause
> trouble.
> 
>     - obtain a fd from socket(2)
>     - check if path is too long to fit in sa->sun_path
>       - if it does, bind(2) the fd to the address
>       - if it does not, fork(2) a child and
>         - in the child, chdir(2) there and use the shortened path
> 	  to bind(2), and exit(3)
>         - the parents just wait(2)s for the child to return. By the
>           time it dies, the fd would be successfully bound to the
> 	  path.
>     - now we have a file descriptor that is bound at that path.

If the trouble is that chdir() isn't thread-safe, I wonder if fork()
creates its own headaches. :) I guess libc usually takes care of the
basics with pthread_atfork(), etc, and the child otherwise would not
need to access much data.

I don't know offhand if this trick actually works. I can imagine it
does, but it hinges on the subtlety between an integer descriptor and
the underlying "file description" (the term used in POSIX). Does binding
a socket operate on the former (like close() does not close the parent's
descriptor) or the latter (like lseek() impacts other descriptors).

I'd guess the latter, but I wasn't sure if you were suggesting this from
experience or if you just invented the technique. ;)

-Peff
Jeff King March 5, 2021, 9:25 a.m. UTC | #6
On Fri, Mar 05, 2021 at 04:02:16AM -0500, Jeff King wrote:

> I don't know offhand if this trick actually works. I can imagine it
> does, but it hinges on the subtlety between an integer descriptor and
> the underlying "file description" (the term used in POSIX). Does binding
> a socket operate on the former (like close() does not close the parent's
> descriptor) or the latter (like lseek() impacts other descriptors).
> 
> I'd guess the latter, but I wasn't sure if you were suggesting this from
> experience or if you just invented the technique. ;)

I was curious, but this does indeed work:

-- >8 --
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>

int main(void)
{
	int listen_fd, client_fd;
	struct addrinfo *ai;
	pid_t pid;

	getaddrinfo("127.0.0.1", "1234", NULL, &ai);
	listen_fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
	pid = fork();
	if (!pid) {
		bind(listen_fd, ai->ai_addr, ai->ai_addrlen);
		return 0;
	}
	waitpid(pid, NULL, 0);

	listen(listen_fd, 5);
	client_fd = accept(listen_fd, NULL, NULL);
	write(client_fd, "foo\n", 4);
	return 0;
}
-- >8 --

-Peff
Chris Torek March 5, 2021, 11:59 a.m. UTC | #7
> On Fri, Mar 05, 2021 at 04:02:16AM -0500, Jeff King wrote:
>
> > I don't know offhand if this [bind in a child] trick actually works. ...

On Fri, Mar 5, 2021 at 1:29 AM Jeff King <peff@peff.net> wrote:
> I was curious, but this does indeed work:
[working example snipped]

Yes, it definitely works.  The bind() call, on a Unix domain socket,
creates a file system entity linked to the underlying socket instance.
The file descriptors, in whatever processes have them, provide
read/write/send/recv/etc linkage to the underlying socket instance
(and also a refcount or other GC protection: with the ability to
send sockets over sockets, simple refcounts stop working and we
need real GC in the kernel...).

Of course, once all the file descriptor references are gone, the
socket (eventually, depending on GC) evaporates.  The file system
entity does not count for keeping the underlying socket alive.  At
this point the file system entity is "dead".  Unfortunately there's no
way to test and clean out the dead entity atomically.  The whole
thing is kind of a mess.

Chris
Jeff Hostetler March 5, 2021, 5:33 p.m. UTC | #8
On 3/5/21 6:59 AM, Chris Torek wrote:
>> On Fri, Mar 05, 2021 at 04:02:16AM -0500, Jeff King wrote:
>>
>>> I don't know offhand if this [bind in a child] trick actually works. ...
> 
> On Fri, Mar 5, 2021 at 1:29 AM Jeff King <peff@peff.net> wrote:
>> I was curious, but this does indeed work:
> [working example snipped]
> 
> Yes, it definitely works.  The bind() call, on a Unix domain socket,
> creates a file system entity linked to the underlying socket instance.
> The file descriptors, in whatever processes have them, provide
> read/write/send/recv/etc linkage to the underlying socket instance
> (and also a refcount or other GC protection: with the ability to
> send sockets over sockets, simple refcounts stop working and we
> need real GC in the kernel...).
> 
> Of course, once all the file descriptor references are gone, the
> socket (eventually, depending on GC) evaporates.  The file system
> entity does not count for keeping the underlying socket alive.  At
> this point the file system entity is "dead".  Unfortunately there's no
> way to test and clean out the dead entity atomically.  The whole
> thing is kind of a mess.
> 
> Chris
> 

The original problem was that chdir() is not safe in a multi-threaded
process because one thread calling chdir() will affect any concurrent
file operations (open(), mkdir(), etc.) that use relative paths.

I think Adding a fork() at this layer would just create new types of 
problems.  For example, if another thread was concurrently writing to
a socket while we were setting up this new socket, we would suddenly
have 1 thread in each process now writing to that socket and the
receiver would get a mixture of output from both processes.  Right?

Jeff
Junio C Hamano March 5, 2021, 5:53 p.m. UTC | #9
Jeff Hostetler <git@jeffhostetler.com> writes:

> The original problem was that chdir() is not safe in a multi-threaded
> process because one thread calling chdir() will affect any concurrent
> file operations (open(), mkdir(), etc.) that use relative paths.
>
> I think Adding a fork() at this layer would just create new types of
> problems.  For example, if another thread was concurrently writing to
> a socket while we were setting up this new socket, we would suddenly
> have 1 thread in each process now writing to that socket and the
> receiver would get a mixture of output from both processes.  Right?

cf. https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html

The fork() function shall create a new process. The new process
(child process) shall be an exact copy of the calling process
(parent process) except as detailed below:

...

 * A process shall be created with a single thread. If a
   multi-threaded process calls fork(), the new process shall
   contain a replica of the calling thread and its entire address
   space, possibly including the states of mutexes and other
   resources. Consequently, to avoid errors, the child process may
   only execute async-signal-safe operations until such time as one
   of the exec functions is called.

So, probably not.
Jeff Hostetler March 5, 2021, 9:30 p.m. UTC | #10
On 3/4/21 3:34 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> The feature is definitely useful; I think I did 1eb10f4091 (unix-socket:
>> handle long socket pathnames, 2012-01-09) in response to a real problem.
>>
>> Certainly callers could handle the error themselves. The reason I pushed
>> it down into the socket code was to avoid having to implement in
>> multiple callers. There are only two, but we'd have needed it in both
>> sides (credential-cache--daemon as the listener, and credential-cache as
>> the client).
>>
>> Ironically, the listening side now does a permanent chdir() to the
>> socket directory anyway, since 6e61449051 (credential-cache--daemon:
>> change to the socket dir on startup, 2016-02-23). So we could just do
>> that first, and then feed the basename to the socket code.
>>
>> The client side would still need to handle it, though. It could probably
>> also chdir to the socket directory without any real downside (once
>> started, I don't think the helper program needs to access the filesystem
>> at all outside of the socket).
>>
>> So I dunno. I'd be OK to just rip the feature out in favor of doing
>> those chdir()s. But that seems like a non-zero amount of work versus
>> leaving, and the existing code has the benefit that if another caller
>> shows up, it could benefit from the feature.
> 
> I am OK to keep the series as-is, and leave it to a possible future
> work to remove the need for chdir even for long paths and not having
> to return an error with ENAMETOOLONG; when such an update happens,
> the "fail if need to chdir" feature this patch is adding will become
> a no-op.
> 

I think I'd like to keep things as I have them now with the "disallow
chdir()" option bit and save the "fork() / bind()" solution for a
later patch series.  Simple IPC is large enough as it is and the new
ENAMETOOLONG error will only affect callers who set the bit.  A later
patch series can easily test and confirm the "fork() / bind() solution
in isolation and test it on the other Unix hosts and then remove the
bit from those callers (if we want).

Jeff
Junio C Hamano March 5, 2021, 9:52 p.m. UTC | #11
Jeff Hostetler <git@jeffhostetler.com> writes:

>> I am OK to keep the series as-is, and leave it to a possible future
>> work to remove the need for chdir even for long paths and not having
>> to return an error with ENAMETOOLONG; when such an update happens,
>> the "fail if need to chdir" feature this patch is adding will become
>> a no-op.
>
> I think I'd like to keep things as I have them now with the "disallow
> chdir()" option bit

So we are on the same page.

> and save the "fork() / bind()" solution for a
> later patch series.  Simple IPC is large enough as it is and the new
> ENAMETOOLONG error will only affect callers who set the bit.  A later
> patch series can easily test and confirm the "fork() / bind() solution
> in isolation and test it on the other Unix hosts and then remove the
> bit from those callers (if we want).

The bit will then become an unused API relic, but that is OK (I do
not think fork/bind would be the best and/or only way to avoid
chdir, though, but it won't matter in the context ofh tis
discussion).
diff mbox series

Patch

diff --git a/builtin/credential-cache.c b/builtin/credential-cache.c
index 9b3f70990597..76a6ba37223f 100644
--- a/builtin/credential-cache.c
+++ b/builtin/credential-cache.c
@@ -14,7 +14,7 @@ 
 static int send_request(const char *socket, const struct strbuf *out)
 {
 	int got_data = 0;
-	int fd = unix_stream_connect(socket);
+	int fd = unix_stream_connect(socket, 0);
 
 	if (fd < 0)
 		return -1;
diff --git a/unix-socket.c b/unix-socket.c
index 5ac7dafe9828..1eaa8cf759c0 100644
--- a/unix-socket.c
+++ b/unix-socket.c
@@ -28,16 +28,23 @@  static void unix_sockaddr_cleanup(struct unix_sockaddr_context *ctx)
 }
 
 static int unix_sockaddr_init(struct sockaddr_un *sa, const char *path,
-			      struct unix_sockaddr_context *ctx)
+			      struct unix_sockaddr_context *ctx,
+			      int disallow_chdir)
 {
 	int size = strlen(path) + 1;
 
 	ctx->orig_dir = NULL;
 	if (size > sizeof(sa->sun_path)) {
-		const char *slash = find_last_dir_sep(path);
+		const char *slash;
 		const char *dir;
 		struct strbuf cwd = STRBUF_INIT;
 
+		if (disallow_chdir) {
+			errno = ENAMETOOLONG;
+			return -1;
+		}
+
+		slash = find_last_dir_sep(path);
 		if (!slash) {
 			errno = ENAMETOOLONG;
 			return -1;
@@ -63,13 +70,13 @@  static int unix_sockaddr_init(struct sockaddr_un *sa, const char *path,
 	return 0;
 }
 
-int unix_stream_connect(const char *path)
+int unix_stream_connect(const char *path, int disallow_chdir)
 {
 	int fd = -1, saved_errno;
 	struct sockaddr_un sa;
 	struct unix_sockaddr_context ctx;
 
-	if (unix_sockaddr_init(&sa, path, &ctx) < 0)
+	if (unix_sockaddr_init(&sa, path, &ctx, disallow_chdir) < 0)
 		return -1;
 	fd = socket(AF_UNIX, SOCK_STREAM, 0);
 	if (fd < 0)
@@ -99,7 +106,7 @@  int unix_stream_listen(const char *path,
 
 	unlink(path);
 
-	if (unix_sockaddr_init(&sa, path, &ctx) < 0)
+	if (unix_sockaddr_init(&sa, path, &ctx, opts->disallow_chdir) < 0)
 		return -1;
 	fd = socket(AF_UNIX, SOCK_STREAM, 0);
 	if (fd < 0)
diff --git a/unix-socket.h b/unix-socket.h
index 06a5a05b03fe..2c0b2e79d7b3 100644
--- a/unix-socket.h
+++ b/unix-socket.h
@@ -3,6 +3,7 @@ 
 
 struct unix_stream_listen_opts {
 	int listen_backlog_size;
+	unsigned int disallow_chdir:1;
 };
 
 #define DEFAULT_UNIX_STREAM_LISTEN_BACKLOG (5)
@@ -10,9 +11,10 @@  struct unix_stream_listen_opts {
 #define UNIX_STREAM_LISTEN_OPTS_INIT \
 { \
 	.listen_backlog_size = DEFAULT_UNIX_STREAM_LISTEN_BACKLOG, \
+	.disallow_chdir = 0, \
 }
 
-int unix_stream_connect(const char *path);
+int unix_stream_connect(const char *path, int disallow_chdir);
 int unix_stream_listen(const char *path,
 		       const struct unix_stream_listen_opts *opts);