diff mbox series

[09/23] fsmonitor-settings: remote repos on macOS are incompatible with FSMonitor

Message ID 412fbc45868e7f2a05a03e424584b53ded4842e9.1644940774.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Builtin FSMonitor Part 3 | expand

Commit Message

Jeff Hostetler Feb. 15, 2022, 3:59 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Teach Git to detect remote working directories on macOS and mark them as
incompatible with FSMonitor.

With this, `git fsmonitor--daemon run` will error out with a message
like it does for bare repos.

Client commands, like `git status`, will not attempt to start the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsm-settings-darwin.c | 66 ++++++++++++++++++++++++++
 fsmonitor-settings.c                   |  5 ++
 fsmonitor-settings.h                   |  1 +
 3 files changed, 72 insertions(+)

Comments

Derrick Stolee Feb. 24, 2022, 3:26 p.m. UTC | #1
On 2/15/2022 10:59 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach Git to detect remote working directories on macOS and mark them as
> incompatible with FSMonitor.
> 
> With this, `git fsmonitor--daemon run` will error out with a message
> like it does for bare repos.
> 
> Client commands, like `git status`, will not attempt to start the daemon.

...

> + * A client machine (such as a laptop) may choose to suspend/resume
> + * and it is unclear (without lots of testing) whether the watcher can
> + * resync after a resume.  We might be able to treat this as a normal
> + * "events were dropped by the kernel" event and do our normal "flush
> + * and resync" --or-- we might need to close the existing (zombie?)
> + * notification fd and create a new one.
> + *
> + * In theory, the above issues need to be addressed whether we are
> + * using the Hook or IPC API.

The only thing I can think about is a case where the filesystem
monitor is actually running on the remote machine and Git
communicates with it over the network. This is currently possible
with the hook, but I am not aware of a hook implementation that
does this.

We can find a way to update the hook interface to communicate to
Git that a remote disk is an appropriate case, but only if there
is a real need for that.

> + * For the builtin FSMonitor, we create the Unix domain socket for the
> + * IPC in the .git directory.  If the working directory is remote,
> + * then the socket will be created on the remote file system.  This

The socket is on the remote file system, but the daemon process is still
local, so I still see this a problem.

> + * can fail if the remote file system does not support UDS file types
> + * (e.g. smbfs to a Windows server) or if the remote kernel does not
> + * allow a non-local process to bind() the socket.  (These problems
> + * could be fixed by moving the UDS out of the .git directory and to a
> + * well-known local directory on the client machine, but care should
> + * be taken to ensure that $HOME is actually local and not a managed
> + * file share.)
> + *
> + * So (for now at least), mark remote working directories as
> + * incompatible.
> + */
> +static enum fsmonitor_reason is_remote(struct repository *r)
> +{
> +	struct statfs fs;
> +
> +	if (statfs(r->worktree, &fs) == -1) {
> +		int saved_errno = errno;
> +		trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s",
> +				 r->worktree, strerror(saved_errno));
> +		errno = saved_errno;
> +		return FSMONITOR_REASON_ZERO;

So if we fail to inspect the filesystem, we report it as compatible?
I suppose other things are likely to fail if checks like this are
fialing, but I wonder if we should preempt that by marking this as
incompatible due to filesystem errors.

> +	}
> +
> +	trace_printf_key(&trace_fsmonitor,
> +			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
> +			 r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename);
> +
> +	if (!(fs.f_flags & MNT_LOCAL))
> +		return FSMONITOR_REASON_REMOTE;

I do see that we need a successful response to give this specific
reason for incompatibility.

> +	return FSMONITOR_REASON_ZERO;
> +}
>  
>  enum fsmonitor_reason fsm_os__incompatible(struct repository *r)
>  {
> +	enum fsmonitor_reason reason;
> +
> +	reason = is_remote(r);
> +	if (reason)
> +		return reason;

This organization is looking like you want to short-circuit the
checks if you find an incompatibility, with the intent of having
multiple checks in the future.

But this can be done with simple || operators:

	return is_remote() ||
	       reason_check_2() ||
	       reason_check_3();

Thanks,
-Stolee
Jeff Hostetler March 1, 2022, 9:30 p.m. UTC | #2
On 2/24/22 10:26 AM, Derrick Stolee wrote:
> On 2/15/2022 10:59 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach Git to detect remote working directories on macOS and mark them as
>> incompatible with FSMonitor.
>>
>> With this, `git fsmonitor--daemon run` will error out with a message
>> like it does for bare repos.
>>
>> Client commands, like `git status`, will not attempt to start the daemon.
> 
> ...
> 
>> + * A client machine (such as a laptop) may choose to suspend/resume
>> + * and it is unclear (without lots of testing) whether the watcher can
>> + * resync after a resume.  We might be able to treat this as a normal
>> + * "events were dropped by the kernel" event and do our normal "flush
>> + * and resync" --or-- we might need to close the existing (zombie?)
>> + * notification fd and create a new one.
>> + *
>> + * In theory, the above issues need to be addressed whether we are
>> + * using the Hook or IPC API.
> 
> The only thing I can think about is a case where the filesystem
> monitor is actually running on the remote machine and Git
> communicates with it over the network. This is currently possible
> with the hook, but I am not aware of a hook implementation that
> does this.
> 
> We can find a way to update the hook interface to communicate to
> Git that a remote disk is an appropriate case, but only if there
> is a real need for that.

I'm not saying we can't support remote working directories.
We do get events from SMB for example, but there network
buffer limits and all the usual connectivity issues with a
local daemon reading an event stream from a remote file system.
So out of caution, I want to disable it for now.  We can always
revisit it later.

WRT the builtin IPC-based daemon, I have the socket/named pipe
restricted to local access only.  This prevents a client like
"git status" from talking to a remote daemon, but it also prevents
a remote bad guy from talking our daemon.  But those are secondary
concerns right now -- I'm mainly concerned with correctly getting
all of the (possibly high volume/speed) events to the client.

There's also the possible incompatibilities of vendor X client-side
OS and vendor Y server-side OS and how well notifications are
supported between them....

So I'd just like to shut it off for now.

> 
>> + * For the builtin FSMonitor, we create the Unix domain socket for the
>> + * IPC in the .git directory.  If the working directory is remote,
>> + * then the socket will be created on the remote file system.  This
> 
> The socket is on the remote file system, but the daemon process is still
> local, so I still see this a problem.
> 
>> + * can fail if the remote file system does not support UDS file types
>> + * (e.g. smbfs to a Windows server) or if the remote kernel does not
>> + * allow a non-local process to bind() the socket.  (These problems
>> + * could be fixed by moving the UDS out of the .git directory and to a
>> + * well-known local directory on the client machine, but care should
>> + * be taken to ensure that $HOME is actually local and not a managed
>> + * file share.)
>> + *
>> + * So (for now at least), mark remote working directories as
>> + * incompatible.
>> + */
>> +static enum fsmonitor_reason is_remote(struct repository *r)
>> +{
>> +	struct statfs fs;
>> +
>> +	if (statfs(r->worktree, &fs) == -1) {
>> +		int saved_errno = errno;
>> +		trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s",
>> +				 r->worktree, strerror(saved_errno));
>> +		errno = saved_errno;
>> +		return FSMONITOR_REASON_ZERO;
> 
> So if we fail to inspect the filesystem, we report it as compatible?
> I suppose other things are likely to fail if checks like this are
> fialing, but I wonder if we should preempt that by marking this as
> incompatible due to filesystem errors.

Good point.

> 
>> +	}
>> +
>> +	trace_printf_key(&trace_fsmonitor,
>> +			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
>> +			 r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename);
>> +
>> +	if (!(fs.f_flags & MNT_LOCAL))
>> +		return FSMONITOR_REASON_REMOTE;
> 
> I do see that we need a successful response to give this specific
> reason for incompatibility.
> 
>> +	return FSMONITOR_REASON_ZERO;
>> +}
>>   
>>   enum fsmonitor_reason fsm_os__incompatible(struct repository *r)
>>   {
>> +	enum fsmonitor_reason reason;
>> +
>> +	reason = is_remote(r);
>> +	if (reason)
>> +		return reason;
> 
> This organization is looking like you want to short-circuit the
> checks if you find an incompatibility, with the intent of having
> multiple checks in the future.
> 
> But this can be done with simple || operators:
> 
> 	return is_remote() ||
> 	       reason_check_2() ||
> 	       reason_check_3();

True.  That is a little shorter.

Thanks
Jeff
diff mbox series

Patch

diff --git a/compat/fsmonitor/fsm-settings-darwin.c b/compat/fsmonitor/fsm-settings-darwin.c
index 176a6f5726c..c3b719d6fb0 100644
--- a/compat/fsmonitor/fsm-settings-darwin.c
+++ b/compat/fsmonitor/fsm-settings-darwin.c
@@ -2,8 +2,74 @@ 
 #include "config.h"
 #include "repository.h"
 #include "fsmonitor-settings.h"
+#include "fsmonitor.h"
+#include <sys/param.h>
+#include <sys/mount.h>
+
+/*
+ * Remote working directories are problematic for FSMonitor.
+ *
+ * The underlying file system on the server machine and/or the remote
+ * mount type (NFS, SAMBA, etc.) dictates whether notification events
+ * are available at all to remote client machines.
+ *
+ * Kernel differences between the server and client machines also
+ * dictate the how (buffering, frequency, de-dup) the events are
+ * delivered to client machine processes.
+ *
+ * A client machine (such as a laptop) may choose to suspend/resume
+ * and it is unclear (without lots of testing) whether the watcher can
+ * resync after a resume.  We might be able to treat this as a normal
+ * "events were dropped by the kernel" event and do our normal "flush
+ * and resync" --or-- we might need to close the existing (zombie?)
+ * notification fd and create a new one.
+ *
+ * In theory, the above issues need to be addressed whether we are
+ * using the Hook or IPC API.
+ *
+ * For the builtin FSMonitor, we create the Unix domain socket for the
+ * IPC in the .git directory.  If the working directory is remote,
+ * then the socket will be created on the remote file system.  This
+ * can fail if the remote file system does not support UDS file types
+ * (e.g. smbfs to a Windows server) or if the remote kernel does not
+ * allow a non-local process to bind() the socket.  (These problems
+ * could be fixed by moving the UDS out of the .git directory and to a
+ * well-known local directory on the client machine, but care should
+ * be taken to ensure that $HOME is actually local and not a managed
+ * file share.)
+ *
+ * So (for now at least), mark remote working directories as
+ * incompatible.
+ */
+static enum fsmonitor_reason is_remote(struct repository *r)
+{
+	struct statfs fs;
+
+	if (statfs(r->worktree, &fs) == -1) {
+		int saved_errno = errno;
+		trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s",
+				 r->worktree, strerror(saved_errno));
+		errno = saved_errno;
+		return FSMONITOR_REASON_ZERO;
+	}
+
+	trace_printf_key(&trace_fsmonitor,
+			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
+			 r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename);
+
+	if (!(fs.f_flags & MNT_LOCAL))
+		return FSMONITOR_REASON_REMOTE;
+
+	return FSMONITOR_REASON_ZERO;
+}
 
 enum fsmonitor_reason fsm_os__incompatible(struct repository *r)
 {
+	enum fsmonitor_reason reason;
+
+	reason = is_remote(r);
+	if (reason)
+		return reason;
+
 	return FSMONITOR_REASON_ZERO;
 }
diff --git a/fsmonitor-settings.c b/fsmonitor-settings.c
index bb2ddd2457f..de69ace246a 100644
--- a/fsmonitor-settings.c
+++ b/fsmonitor-settings.c
@@ -161,6 +161,11 @@  static void create_reason_message(struct repository *r,
 			      _("virtual repos are incompatible with fsmonitor"));
 		return;
 
+	case FSMONITOR_REASON_REMOTE:
+		strbuf_addstr(buf_reason,
+			      _("remote repos are incompatible with fsmonitor"));
+		return;
+
 	default:
 		BUG("Unhandled case in create_reason_message '%d'", s->reason);
 	}
diff --git a/fsmonitor-settings.h b/fsmonitor-settings.h
index c169683bf2d..fca25887c0f 100644
--- a/fsmonitor-settings.h
+++ b/fsmonitor-settings.h
@@ -17,6 +17,7 @@  enum fsmonitor_reason {
 	FSMONITOR_REASON_ZERO = 0,
 	FSMONITOR_REASON_BARE = 1,
 	FSMONITOR_REASON_VIRTUAL = 2,
+	FSMONITOR_REASON_REMOTE = 3,
 };
 
 void fsm_settings__set_ipc(struct repository *r);