Message ID | pull.1326.git.1660855703816.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fsmonitor: option to allow fsmonitor to run against network-mounted repos | expand |
"Eric DeCosta via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Eric DeCosta <edecosta@mathworks.com> > > Though perhaps not common, there are uses cases where users have large, > network-mounted repos. Having the ability to run fsmonitor against > network paths would benefit those users. > > As a first step towards enabling fsmonitor to work ... Didn't we already see the first step and got it reviewed? I think I've already merged the last round to 'next'. Puzzled...
Junio C Hamano <gitster@pobox.com> writes: > "Eric DeCosta via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Eric DeCosta <edecosta@mathworks.com> >> >> Though perhaps not common, there are uses cases where users have large, >> network-mounted repos. Having the ability to run fsmonitor against >> network paths would benefit those users. >> >> As a first step towards enabling fsmonitor to work ... > > Didn't we already see the first step and got it reviewed? I think > I've already merged the last round to 'next'. > > Puzzled... Ah, OK, so this is not exactly the "first step", but builds on top of the previous one? At least the patch needs a better title to make it distinguishable from the other one. Will queue to wait for others to review. Thanks.
Hi Eric, On Thu, 18 Aug 2022, Eric DeCosta via GitGitGadget wrote: > From: Eric DeCosta <edecosta@mathworks.com> > > Though perhaps not common, there are uses cases where users have large, > network-mounted repos. Having the ability to run fsmonitor against > network paths would benefit those users. > > As a first step towards enabling fsmonitor to work against > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > was introduced for Windows. If you start the commit message along the following lines, it might be easier/quicker to grok the context for the keen reader: In 85dc0da6dcf (fsmonitor: option to allow fsmonitor to run against network-mounted repos, 2022-08-11), the Windows backend of the FSMonitor learned to allow running on network drives, via the `fsmonitor.allowRemote` config setting. > Setting this option to true will override the default behavior > (erroring-out) when a network-mounted repo is detected by fsmonitor. In > order for macOS to have parity with Windows, the same option is now > introduced for macOS. > > The the added wrinkle being that the Unix domain socket (UDS) file > used for IPC cannot be created in a network location; instead the > temporary directory is used. Thank you very much for this note, after a cursory read I expected that part of the code to be a left-over from some "We know better than the user" type of automatic default, and this paragraph definitely helped me overcome that expectation. > > Signed-off-by: Eric DeCosta <edecosta@mathworks.com> > --- > fsmonitor: option to allow fsmonitor to run against network-mounted > repos > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1326%2Fedecosta-mw%2Ffsmonitor_macos-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1326/edecosta-mw/fsmonitor_macos-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1326 > > compat/fsmonitor/fsm-settings-darwin.c | 77 ++++++++++++++++++++++---- > fsmonitor-ipc.c | 47 +++++++++++++++- > fsmonitor-ipc.h | 6 ++ > 3 files changed, 117 insertions(+), 13 deletions(-) I am somewhat puzzled that this has no corresponding change to `Documentation/`. And now I realize that this was the case also for the patch adding `fsmonitor.allowRemote` support for Windows. Could I ask you to add a patch to document this config setting? > > diff --git a/compat/fsmonitor/fsm-settings-darwin.c b/compat/fsmonitor/fsm-settings-darwin.c > index efc732c0f31..9e2ea3b90cc 100644 > --- a/compat/fsmonitor/fsm-settings-darwin.c > +++ b/compat/fsmonitor/fsm-settings-darwin.c > @@ -2,10 +2,28 @@ > #include "config.h" > #include "repository.h" > #include "fsmonitor-settings.h" > +#include "fsmonitor-ipc.h" > #include "fsmonitor.h" > #include <sys/param.h> > #include <sys/mount.h> > > +/* > + * Check if monitoring remote working directories is allowed. > + * > + * By default, monitoring remote working directories is > + * disabled. Users may override this behavior in enviroments where > + * they have proper support. > + */ > +static int check_config_allowremote(struct repository *r) > +{ > + int allow; > + > + if (!repo_config_get_bool(r, "fsmonitor.allowremote", &allow)) > + return allow; > + > + return -1; /* fsmonitor.allowremote not set */ > +} > + > /* > * [1] Remote working directories are problematic for FSMonitor. > * > @@ -27,24 +45,22 @@ > * In theory, the above issues need to be addressed whether we are > * using the Hook or IPC API. > * > + * So (for now at least), mark remote working directories as > + * incompatible by default. > + * This was moved up, okay. > * For the builtin FSMonitor, we create the Unix domain socket for the > - * IPC in the .git directory. If the working directory is remote, > - * then the socket will be created on the remote file system. This > - * can fail if the remote file system does not support UDS file types > - * (e.g. smbfs to a Windows server) or if the remote kernel does not > - * allow a non-local process to bind() the socket. (These problems > - * could be fixed by moving the UDS out of the .git directory and to a > - * well-known local directory on the client machine, but care should > - * be taken to ensure that $HOME is actually local and not a managed > - * file share.) > + * IPC in the temporary directory. If the temporary directory is This is incorrect. It is still the `.git` directory in the common case, not a temporary directory. > + * remote, then the socket will be created on the remote file system. > + * This can fail if the remote file system does not support UDS file > + * types (e.g. smbfs to a Windows server) or if the remote kernel does > + * not allow a non-local process to bind() the socket. > * > - * So (for now at least), mark remote working directories as > - * incompatible. > + * Therefore remote UDS locations are marked as incompatible. > * > * > * [2] FAT32 and NTFS working directories are problematic too. Doesn't this patch address this, too? See below for more on that. > * > - * The builtin FSMonitor uses a Unix domain socket in the .git > + * The builtin FSMonitor uses a Unix domain socket in the temporary > * directory for IPC. These Windows drive formats do not support > * Unix domain sockets, so mark them as incompatible for the daemon. > * > @@ -65,6 +81,39 @@ static enum fsmonitor_reason check_volume(struct repository *r) > "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'", > r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename); > > + if (!(fs.f_flags & MNT_LOCAL)) { > + switch (check_config_allowremote(r)) { > + case 0: /* config overrides and disables */ > + return FSMONITOR_REASON_REMOTE; > + case 1: /* config overrides and enables */ > + return FSMONITOR_REASON_OK; > + default: > + break; /* config has no opinion */ > + } > + > + return FSMONITOR_REASON_REMOTE; > + } This `switch()` statement sounds like a verbose way to say the same as: return check_config_allowremote(r) == 1 ? FSMONITOR_REASON_OK : FSMONITOR_REASON_REMOTE; > + > + return FSMONITOR_REASON_OK; > +} > + > +static enum fsmonitor_reason check_uds_volume(void) What's an UDS volume? Do you mean to say "Unix Domain Socket Volume"? If so, it would be better to turn this into a function called `filesystem_supports_unix_sockets()` and to return an `int`, 1 for "yes", 0 for "no". > +{ > + struct statfs fs; > + const char *path = fsmonitor_ipc__get_path(); > + > + if (statfs(path, &fs) == -1) { > + int saved_errno = errno; > + trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s", > + path, strerror(saved_errno)); > + errno = saved_errno; > + return FSMONITOR_REASON_ERROR; > + } > + > + trace_printf_key(&trace_fsmonitor, > + "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'", > + path, fs.f_type, fs.f_flags, fs.f_fstypename); > + > if (!(fs.f_flags & MNT_LOCAL)) > return FSMONITOR_REASON_REMOTE; > > @@ -85,5 +134,9 @@ enum fsmonitor_reason fsm_os__incompatible(struct repository *r) It is unfortunate that the diff hunk stops here, and mails lack a button to increase the diff context. In this instance, the hidden part of the `check_volume()` function is quite interesting: it returns `FSMONITOR_REASON_NOSOCKETS` for `msdos` and `ntfs` file systems. Which means that your patch changes behavior not only for remote file systems, but also for local ones without support for Unix sockets. To heed the principle of separation of concerns, please do split out that part. I would recommend to make it the first patch to support `msdos`/`ntfs` file systems (by registering the Unix sockets in a temporary directory instead of the `.git/` directory). The second patch can then introduce support for `fsmonitor.allowRemote` on macOS on top of the first patch. > if (reason != FSMONITOR_REASON_OK) > return reason; > > + reason = check_uds_volume(); > + if (reason != FSMONITOR_REASON_OK) > + return reason; > + > return FSMONITOR_REASON_OK; > } > diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c > index 789e7397baa..6e9b40a03d5 100644 > --- a/fsmonitor-ipc.c > +++ b/fsmonitor-ipc.c > @@ -4,6 +4,7 @@ > #include "fsmonitor-ipc.h" > #include "run-command.h" > #include "strbuf.h" > +#include "tempfile.h" > #include "trace2.h" > > #ifndef HAVE_FSMONITOR_DAEMON_BACKEND > @@ -47,7 +48,51 @@ int fsmonitor_ipc__is_supported(void) > return 1; > } > > -GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc") > +GIT_PATH_FUNC(fsmonitor_ipc__get_pathfile, "fsmonitor--daemon.ipc") Why rename this? That's unnecessary chatter in the patch. Let's avoid such things in the future, it only costs reviewers time. > + > +static char *gen_ipc_file(void) > +{ > + char *retval = NULL; > + struct tempfile *ipc; > + > + const char *ipc_file = fsmonitor_ipc__get_pathfile(); > + FILE *fp = fopen(ipc_file, "w"); > + > + if (!fp) > + die_errno("error opening '%s'", ipc_file); > + ipc = mks_tempfile_t("fsmonitor_ipc_XXXXXX"); > + strbuf_write(&ipc->filename, fp); > + fclose(fp); > + retval = strbuf_detach(&ipc->filename, NULL); > + strbuf_release(&ipc->filename); > + return retval; > +} > + > +const char *fsmonitor_ipc__get_path(void) > +{ > + char *retval = NULL; > + struct strbuf sb = STRBUF_INIT; > + > + const char *ipc_file = fsmonitor_ipc__get_pathfile(); > + FILE *fp = fopen(ipc_file, "r"); > + > + if (!fp) { > + return gen_ipc_file(); > + } else { > + strbuf_read(&sb, fileno(fp), 0); > + fclose(fp); > + fp = fopen(sb.buf, "r"); > + if (!fp) { /* generate new file */ > + if (unlink(ipc_file) < 0) > + die_errno("could not remove '%s'", ipc_file); > + return gen_ipc_file(); > + } > + fclose(fp); > + retval = strbuf_detach(&sb, NULL); > + strbuf_release(&sb); > + return retval; > + } > +} I am afraid I do not understand how this code can guarantee a fixed path for the Unix domain socket. It _needs_ to be fixed so that a singleton daemon can run and listen on it, and an arbitrary number of Git clients can connect to it. If it is not fixed, you will cause Git to quite possibly start a new FSMonitor daemon for every invocation that wants to connect to an FSMonitor daemon. This means that the path of the Unix socket needs to have a 1:1 relationship to the path of the `.git/` directory. If you install it in that directory, that invariant is naturally fulfilled. If you want to install it elsewhere, you will have to come up with a reliable way to guarantee that connection. One option would be to install the Unix sockets in the home directory, under a name like `.git-fsmonitor-<hash>` where the <hash> is e.g. a SHA-1/SHA-256 of the canonicalized path of the `.git/` directory. > > enum ipc_active_state fsmonitor_ipc__get_state(void) > { > diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h > index b6a7067c3af..63277dea39e 100644 > --- a/fsmonitor-ipc.h > +++ b/fsmonitor-ipc.h > @@ -18,6 +18,12 @@ int fsmonitor_ipc__is_supported(void); > */ > const char *fsmonitor_ipc__get_path(void); > > +/* > + * Returns the pathname to the file that contains the pathname to the > + * IPC named pipe or Unix domain socket. > + */ > +const char *fsmonitor_ipc__get_pathfile(void); > + > /* > * Try to determine whether there is a `git-fsmonitor--daemon` process > * listening on the IPC pipe/socket. Thank you for working on this, also on the Windows side. It definitely helps! Ciao, Dscho
On 8/18/22 4:48 PM, Eric DeCosta via GitGitGadget wrote: > From: Eric DeCosta <edecosta@mathworks.com> > > Though perhaps not common, there are uses cases where users have large, > network-mounted repos. Having the ability to run fsmonitor against > network paths would benefit those users. > > As a first step towards enabling fsmonitor to work against > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > was introduced for Windows. Setting this option to true will override > the default behavior (erroring-out) when a network-mounted repo is > detected by fsmonitor. In order for macOS to have parity with Windows, > the same option is now introduced for macOS. We might also say that this config option only allows FSMonitor to TRY to consider using a network-mounted repo. And that this ability is considered experimental until sufficient testing can be completed and we can determine the combinations of { client os } x { server os } x { remote access } x { file system type } that are known to work or not work and we can update the defaults and the documentation accordingly. For example, on a MacOS client, we expect the local "fseventsd" service to send us recursive events on all files and sub directories under the repo root. If the server is a Linux machine (which doesn't really do recursive events), does exporting the FS from the server over NFS or SMB (or whatever) cause the Linux host to send enough information to the client machine for fseventsd to synthesize the recursive event stream locally that FSMonitor expects. It might. It might not. That combination should be tested (along with a lot of other combinations). But again, this patch is just about allowing the (informed?) user to try it and begin testing various combinations. > > The the added wrinkle being that the Unix domain socket (UDS) file > used for IPC cannot be created in a network location; instead the > temporary directory is used. This scares me a bit. I put the socket in the .git directory so that we are guaranteed that only one daemon will run on the repository and that all clients will know where to find that socket (if it exists). It looks like you're creating the UDS using a tmp pathname and writing the pathname to the actual .git/fsmonitor--daemon.ipc FILE. This adds a layer of indirection and is prone to races. The act of creating the actual socket is protected by code in unix-socket.c and unix-stream-server.c to try to safely create the socket and avoid stepping on another active daemon (who currently has the server-side of the socket open). My code also detects dead sockets (where a previous daemon died and failed to delete the socket). Additionally, allowing remote access means that the repo could be used by multiple client machines and/or by the server machine itself. Consider the example of two MacOS clients mounting the remote repo and both wanting to start FSMonitor. They would constantly fight to recreate a new local-tmp-based socket and update your pathname FILE and end up invalidating each other on each command. Also, if someone overwrites your new pathname FILE, but doesn't tell the daemon, the daemon will be orphaned -- still running, but no one will ever connect to it because the FILE no longer points to it. There was a suggestion later in this thread about using a SHA-1 or SHA-256 hash of the pathname to avoid the tmp XXXXXX pattern and just put the socket in $HOME (and omit the need for the new fsmonitor-daemon.ipc FILE completely). This might work, but we need to be careful because a user might have hardlinks or symlinks locally so there may be more than one unique path to the repo on the local system. (It is OK to have more than one daemon listening to a repo, just less efficient.) As an interim step, you might try using my original socket code plus just the config.allowRemote=true change. And test it on a mounted repo where you've converted the .git directory to a .git file and moved contents of the .git directory to somewhere local. Then the UDS would be created in the local GITDIR instead of on the remote system. This won't help any of the sharing cases I described above, but will let you experiment with getting remote events. Jeff > base-commit: 9bf691b78cf906751e65d65ba0c6ffdcd9a5a12c >
On Thu, Aug 18, 2022 at 5:00 PM Eric DeCosta via GitGitGadget <gitgitgadget@gmail.com> wrote: > Though perhaps not common, there are uses cases where users have large, s/uses cases/use cases/ > network-mounted repos. Having the ability to run fsmonitor against > network paths would benefit those users. > > As a first step towards enabling fsmonitor to work against > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > was introduced for Windows. Setting this option to true will override > the default behavior (erroring-out) when a network-mounted repo is > detected by fsmonitor. In order for macOS to have parity with Windows, > the same option is now introduced for macOS. > > The the added wrinkle being that the Unix domain socket (UDS) file s/The the/The/ > used for IPC cannot be created in a network location; instead the > temporary directory is used. > > Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
> -----Original Message----- > From: Jeff Hostetler <git@jeffhostetler.com> > Sent: Friday, August 19, 2022 12:50 PM > To: Eric DeCosta via GitGitGadget <gitgitgadget@gmail.com>; > git@vger.kernel.org > Cc: Eric DeCosta <edecosta@mathworks.com> > Subject: Re: [PATCH] fsmonitor: option to allow fsmonitor to run against > network-mounted repos > > > > On 8/18/22 4:48 PM, Eric DeCosta via GitGitGadget wrote: > > From: Eric DeCosta <edecosta@mathworks.com> > > > > Though perhaps not common, there are uses cases where users have large, > > network-mounted repos. Having the ability to run fsmonitor against > > network paths would benefit those users. > > > > As a first step towards enabling fsmonitor to work against > > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > > was introduced for Windows. Setting this option to true will override > > the default behavior (erroring-out) when a network-mounted repo is > > detected by fsmonitor. In order for macOS to have parity with Windows, > > the same option is now introduced for macOS. > > We might also say that this config option only allows FSMonitor > to TRY to consider using a network-mounted repo. And that this > ability is considered experimental until sufficient testing can > be completed and we can determine the combinations of > { client os } x { server os } x { remote access } x { file system type } > that are known to work or not work and we can update the defaults > and the documentation accordingly. > Yes, very experimental. > For example, on a MacOS client, we expect the local "fseventsd" service > to send us recursive events on all files and sub directories under the > repo root. If the server is a Linux machine (which doesn't really do > recursive events), does exporting the FS from the server over NFS or SMB > (or whatever) cause the Linux host to send enough information to the > client machine for fseventsd to synthesize the recursive event stream > locally that FSMonitor expects. It might. It might not. That > combination should be tested (along with a lot of other combinations). > > But again, this patch is just about allowing the (informed?) user to > try it and begin testing various combinations. > Yes, the point is to allow users to try it out. Self-servingly, I have about 3K users who make heavy use of network-mounted sandboxes on the three major platforms; all connecting via NFS or SMB to Linux file servers. Hardly exhaustive, but the file system change notification APIs (inotify, FSEvents, and ReadDirectoryCHangesW) all seem to work correctly. Thus my motivation to work on this aspect of git :-) > > > > > The the added wrinkle being that the Unix domain socket (UDS) file > > used for IPC cannot be created in a network location; instead the > > temporary directory is used. > > This scares me a bit. I put the socket in the .git directory > so that we are guaranteed that only one daemon will run on the > repository and that all clients will know where to find that socket > (if it exists). > > It looks like you're creating the UDS using a tmp pathname and > writing the pathname to the actual .git/fsmonitor--daemon.ipc FILE. > This adds a layer of indirection and is prone to races. > Good point. > > The act of creating the actual socket is protected by code in > unix-socket.c and unix-stream-server.c to try to safely create > the socket and avoid stepping on another active daemon (who > currently has the server-side of the socket open). > > My code also detects dead sockets (where a previous daemon died > and failed to delete the socket). > > > Additionally, allowing remote access means that the repo could > be used by multiple client machines and/or by the server machine > itself. Consider the example of two MacOS clients mounting the > remote repo and both wanting to start FSMonitor. They would > constantly fight to recreate a new local-tmp-based socket and > update your pathname FILE and end up invalidating each other on > each command. > I see your point - they'd stomp on each other. As far as multiple client machines mounting the remote repo, I have doubts that FSMonitor would even see changes made from another machine. Worth trying out and documenting as needed - might even be better off being considered as unsupported. > > Also, if someone overwrites your new pathname FILE, but doesn't tell > the daemon, the daemon will be orphaned -- still running, but no one > will ever connect to it because the FILE no longer points to it. > True, thanks for pointing that out. > > There was a suggestion later in this thread about using a SHA-1 > or SHA-256 hash of the pathname to avoid the tmp XXXXXX pattern > and just put the socket in $HOME (and omit the need for the new > fsmonitor-daemon.ipc FILE completely). This might work, but we > need to be careful because a user might have hardlinks or symlinks > locally so there may be more than one unique path to the repo > on the local system. (It is OK to have more than one daemon > listening to a repo, just less efficient.) > Ah, I see. > > As an interim step, you might try using my original socket code > plus just the config.allowRemote=true change. And test it on a > mounted repo where you've converted the .git directory to a .git > file and moved contents of the .git directory to somewhere local. > Then the UDS would be created in the local GITDIR instead of on > the remote system. This won't help any of the sharing cases I > described above, but will let you experiment with getting remote > events. > Within the context of the environment that I have available to me (macOS over NFS to a Linux file server), FSEvents is working correctly. I can make changes at any arbitrary place inside of the repo and an event is generated. It's looking like that the Unix domain socket (UDS) file should remain where it is unless fsmonitor.allowRemote is true. If fsmonitor.allowRemote is true then the UDS file can be located in $HOME with the caveat that if there is more than one path to the repo (via hard or sym links) that things might not work as expected. I think that's OK given the experimental nature of the feature. -Eric > Jeff > > > > > base-commit: 9bf691b78cf906751e65d65ba0c6ffdcd9a5a12c > >
On Thu, Aug 18, 2022 at 08:48:23PM +0000, Eric DeCosta via GitGitGadget wrote: > From: Eric DeCosta <edecosta@mathworks.com> Just some comments on the commit message, please see inline. > > Though perhaps not common, there are uses cases where users have large, > network-mounted repos. Having the ability to run fsmonitor against > network paths would benefit those users. I think that we can drop the "Though perhaps not common" - it doesn't add any information about the technical part of the patch. (And that is, what is important) And I personally use network-mounted repos every day ;-) Sone users have large, network-mounted repos. Having the ability to run fsmonitor against network paths would benefit those users. > > As a first step towards enabling fsmonitor to work against > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > was introduced for Windows. Setting this option to true will override > the default behavior (erroring-out) when a network-mounted repo is > detected by fsmonitor. [] The same option is now introduced for macOS. > > The the added wrinkle being that the Unix domain socket (UDS) file ^^^ ^^^ > used for IPC cannot be created in a network location; instead the > temporary directory is used.
On 8/19/22 2:38 PM, Eric DeCosta wrote: > > >> -----Original Message----- >> From: Jeff Hostetler <git@jeffhostetler.com> >> Sent: Friday, August 19, 2022 12:50 PM >> To: Eric DeCosta via GitGitGadget <gitgitgadget@gmail.com>; >> git@vger.kernel.org >> Cc: Eric DeCosta <edecosta@mathworks.com> >> Subject: Re: [PATCH] fsmonitor: option to allow fsmonitor to run against >> network-mounted repos >> >> >> >> On 8/18/22 4:48 PM, Eric DeCosta via GitGitGadget wrote: >>> From: Eric DeCosta <edecosta@mathworks.com> >>> [...] > Yes, the point is to allow users to try it out. Self-servingly, I have about > 3K users who make heavy use of network-mounted sandboxes on the > three major platforms; all connecting via NFS or SMB to Linux file > servers. Hardly exhaustive, but the file system change notification APIs > (inotify, FSEvents, and ReadDirectoryCHangesW) all seem to work correctly. > Thus my motivation to work on this aspect of git :-) Perfect. I have to be careful here, I don't want to discourage experimentation and testing -- and if you have access to a pool of users who can beat on it, then great let's try it. At the time, I didn't have such a "captive audience"... :-) And I just wanted to be sure that everyone understood that some of these combinations have never been tried.... [...] > As far as multiple client machines mounting the remote repo, I have > doubts that FSMonitor would even see changes made from another > machine. Worth trying out and documenting as needed - might even > be better off being considered as unsupported. Yeah, that's another dimension. (1) is whether we miss events that we generated locally. (2) is whether events get relayed to us from other clients. >> There was a suggestion later in this thread about using a SHA-1 >> or SHA-256 hash of the pathname to avoid the tmp XXXXXX pattern >> and just put the socket in $HOME (and omit the need for the new >> fsmonitor-daemon.ipc FILE completely). This might work, but we >> need to be careful because a user might have hardlinks or symlinks >> locally so there may be more than one unique path to the repo >> on the local system. (It is OK to have more than one daemon >> listening to a repo, just less efficient.) >> > Ah, I see. > >> >> As an interim step, you might try using my original socket code >> plus just the config.allowRemote=true change. And test it on a >> mounted repo where you've converted the .git directory to a .git >> file and moved contents of the .git directory to somewhere local. >> Then the UDS would be created in the local GITDIR instead of on >> the remote system. This won't help any of the sharing cases I >> described above, but will let you experiment with getting remote >> events. >> > Within the context of the environment that I have available to me > (macOS over NFS to a Linux file server), FSEvents is working correctly. > I can make changes at any arbitrary place inside of the repo and an > event is generated. > > It's looking like that the Unix domain socket (UDS) file should remain > where it is unless fsmonitor.allowRemote is true. > > If fsmonitor.allowRemote is true then the UDS file can be located > in $HOME with the caveat that if there is more than one path to > the repo (via hard or sym links) that things might not work as > expected. I think that's OK given the experimental nature of the > feature. Yeah, I'd experiment with moving the UDS to $HOME (assuming you don't remote mount home directories too) and see how it works for you. Later, after we have more experience with it, we can talk about just having it always be in $HOME. Jeff
> As a first step towards enabling fsmonitor to work against > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > was introduced for Windows. Setting this option to true will override > the default behavior (erroring-out) when a network-mounted repo is > detected by fsmonitor. In order for macOS to have parity with Windows, > the same option is now introduced for macOS. With this merged in, recent CI runs for 'seen' e.g. https://github.com/git/git/actions/runs/2892889122 seems to break macOS jobs, letting them hog CPU forever and exceed 6hr or whatever the limit is. As an experiment I pushed out 'seen' but without this commit (not the entire topic is excluded, the Windows one is still included). As there is nothing specific to macOS between 'next' and 'seen', macOS jobs seem to pass, which is not very surprising. https://github.com/git/git/actions/runs/2896207171 As the patch collected some review comments, I've already marked it in the "What's cooking" draft as expecting a reroll of that step; until that happens, let's keep it out of 'seen'. Thanks.
Hi Junio, On Sat, 20 Aug 2022, Junio C Hamano wrote: > > As a first step towards enabling fsmonitor to work against > > network-mounted repos, a configuration option, 'fsmonitor.allowRemote' > > was introduced for Windows. Setting this option to true will override > > the default behavior (erroring-out) when a network-mounted repo is > > detected by fsmonitor. In order for macOS to have parity with Windows, > > the same option is now introduced for macOS. > > With this merged in, recent CI runs for 'seen' > > e.g. https://github.com/git/git/actions/runs/2892889122 > > seems to break macOS jobs, letting them hog CPU forever and exceed > 6hr or whatever the limit is. > > As an experiment I pushed out 'seen' but without this commit (not > the entire topic is excluded, the Windows one is still included). > As there is nothing specific to macOS between 'next' and 'seen', > macOS jobs seem to pass, which is not very surprising. > > https://github.com/git/git/actions/runs/2896207171 > > As the patch collected some review comments, I've already marked it > in the "What's cooking" draft as expecting a reroll of that step; > until that happens, let's keep it out of 'seen'. It makes sense to keep it out of `seen`, and at the same time I would like to encourage Eric to investigate what causes those time-outs. When toggling timestamps (click on the wheel on the upper right) at https://github.com/git/git/runs/7927812510?check_suite_focus=true#step:4:1774, it can be seen that at close to 1am, t9903 finished, but then nothing happens until twenty past 6am. I've downloaded the raw logs (also available via the wheel on the upper right) to find out which test timed out: $ diff -u \ <(sed -n 's/.*\] \(t[0-9][^ ]*\).*/\1/p' <~/Downloads/17 | sort) \ <(git ls-tree upstream/seen:t | cut -c 54- | grep '^t[0-9].*-.*sh$') --- /dev/fd/63 2022-08-22 14:56:05.510269527 +0200 +++ /dev/fd/62 2022-08-22 14:56:05.510269527 +0200 @@ -794,6 +794,7 @@ t7524-commit-summary.sh t7525-status-rename.sh t7526-commit-pathspec-file.sh +t7527-builtin-fsmonitor.sh t7528-signed-commit-ssh.sh t7600-merge.sh t7601-merge-pull-config.sh @@ -945,6 +946,7 @@ t9812-git-p4-wildcards.sh t9813-git-p4-preserve-users.sh t9814-git-p4-rename.sh +t9815-git-p4-submit-fail.sh t9816-git-p4-locked.sh t9817-git-p4-exclude.sh t9818-git-p4-block.sh @@ -964,5 +966,8 @@ t9832-unshelve.sh t9833-errors.sh t9834-git-p4-file-dir-bug.sh +t9835-git-p4-metadata-encoding-python2.sh +t9836-git-p4-metadata-encoding-python3.sh t9901-git-web--browse.sh +t9902-completion.sh t9903-bash-prompt.sh I have no idea what's up with t98* and t9902, but I would bet that they were somehow "swallowed" by `prove` being terminated, and that the actual test script that times out is t7527. Eric: To investigate, you will want to reproduce the problem on a macOS machine. If you have none available, you could create a temporary branch, heavily edit the CI definition, and push it to GitHub. And by heavy edits I mean something like this: - Remove all non-macOS jobs from `.github/workflows/main.yml` (that means removing all but the `regular` job, removing all but at least one `macos` matrix entry, and removing the the `needs: ci-config` and corresponding `if:` line. - Edit `t/Makefile` to define `T = t7527-builtin-fsmonitor.sh` instead of running all the tests. - Edit `.github/workflows/main.yml` so that the step that causes the time-out has a chance of timing out much sooner (and the subsequent steps then have a chance to upload the relevant logs): https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepstimeout-minutes If this does not shed any light into the issue, please let me know, I have a couple more aces up my sleeve. Ciao, Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Eric: To investigate, you will want to reproduce the problem on a macOS > machine. If you have none available, you could create a temporary branch, > heavily edit the CI definition, and push it to GitHub. And by heavy edits > I mean something like this: > > - Remove all non-macOS jobs from `.github/workflows/main.yml` (that means > removing all but the `regular` job, removing all but at least one > `macos` matrix entry, and removing the the `needs: ci-config` and > corresponding `if:` line. > > - Edit `t/Makefile` to define `T = t7527-builtin-fsmonitor.sh` instead of > running all the tests. > > - Edit `.github/workflows/main.yml` so that the step that causes the > time-out has a chance of timing out much sooner (and the subsequent > steps then have a chance to upload the relevant logs): > https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepstimeout-minutes > > If this does not shed any light into the issue, please let me know, I have > a couple more aces up my sleeve. Thanks, both.
On 8/22/22 9:22 AM, Johannes Schindelin wrote: > Hi Junio, > > On Sat, 20 Aug 2022, Junio C Hamano wrote: > >>> As a first step towards enabling fsmonitor to work against >>> network-mounted repos, a configuration option, 'fsmonitor.allowRemote' >>> was introduced for Windows. Setting this option to true will override >>> the default behavior (erroring-out) when a network-mounted repo is >>> detected by fsmonitor. In order for macOS to have parity with Windows, >>> the same option is now introduced for macOS. >> >> With this merged in, recent CI runs for 'seen' >> >> e.g. https://github.com/git/git/actions/runs/2892889122 >> >> seems to break macOS jobs, letting them hog CPU forever and exceed >> 6hr or whatever the limit is. >> >> As an experiment I pushed out 'seen' but without this commit (not >> the entire topic is excluded, the Windows one is still included). >> As there is nothing specific to macOS between 'next' and 'seen', >> macOS jobs seem to pass, which is not very surprising. >> >> https://github.com/git/git/actions/runs/2896207171 >> >> As the patch collected some review comments, I've already marked it >> in the "What's cooking" draft as expecting a reroll of that step; >> until that happens, let's keep it out of 'seen'. > > It makes sense to keep it out of `seen`, and at the same time I would like > to encourage Eric to investigate what causes those time-outs. > > When toggling timestamps (click on the wheel on the upper right) at > https://github.com/git/git/runs/7927812510?check_suite_focus=true#step:4:1774, > it can be seen that at close to 1am, t9903 finished, but then nothing > happens until twenty past 6am. > > I've downloaded the raw logs (also available via the wheel on the upper > right) to find out which test timed out: > > $ diff -u \ > <(sed -n 's/.*\] \(t[0-9][^ ]*\).*/\1/p' <~/Downloads/17 | sort) \ > <(git ls-tree upstream/seen:t | cut -c 54- | grep '^t[0-9].*-.*sh$') > > --- /dev/fd/63 2022-08-22 14:56:05.510269527 +0200 > +++ /dev/fd/62 2022-08-22 14:56:05.510269527 +0200 > @@ -794,6 +794,7 @@ > t7524-commit-summary.sh > t7525-status-rename.sh > t7526-commit-pathspec-file.sh > +t7527-builtin-fsmonitor.sh > t7528-signed-commit-ssh.sh > t7600-merge.sh > t7601-merge-pull-config.sh > @@ -945,6 +946,7 @@ > t9812-git-p4-wildcards.sh > t9813-git-p4-preserve-users.sh > t9814-git-p4-rename.sh > +t9815-git-p4-submit-fail.sh > t9816-git-p4-locked.sh > t9817-git-p4-exclude.sh > t9818-git-p4-block.sh > @@ -964,5 +966,8 @@ > t9832-unshelve.sh > t9833-errors.sh > t9834-git-p4-file-dir-bug.sh > +t9835-git-p4-metadata-encoding-python2.sh > +t9836-git-p4-metadata-encoding-python3.sh > t9901-git-web--browse.sh > +t9902-completion.sh > t9903-bash-prompt.sh > > I have no idea what's up with t98* and t9902, but I would bet that they > were somehow "swallowed" by `prove` being terminated, and that the > actual test script that times out is t7527. > > Eric: To investigate, you will want to reproduce the problem on a macOS > machine. If you have none available, you could create a temporary branch, > heavily edit the CI definition, and push it to GitHub. And by heavy edits > I mean something like this: > > - Remove all non-macOS jobs from `.github/workflows/main.yml` (that means > removing all but the `regular` job, removing all but at least one > `macos` matrix entry, and removing the the `needs: ci-config` and > corresponding `if:` line. > > - Edit `t/Makefile` to define `T = t7527-builtin-fsmonitor.sh` instead of > running all the tests. > > - Edit `.github/workflows/main.yml` so that the step that causes the > time-out has a chance of timing out much sooner (and the subsequent > steps then have a chance to upload the relevant logs): > https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepstimeout-minutes I would also set GIT_TRACE_FSMONITOR and GIT_TRACE2_PERF (on both daemon and client sides of the tests) and capture the logs and try to figure out what is happening. I suspect that this testing should wait until you redo the patch to remove the tmp file stuff and just move the socket into $HOME as we talked about earlier. Jeff
> I would also set GIT_TRACE_FSMONITOR and GIT_TRACE2_PERF (on both daemon > and client sides of the tests) and capture the logs and try to figure > out what is happening. > > I suspect that this testing should wait until you redo the patch to > remove the tmp file stuff and just move the socket into $HOME as we > talked about earlier. > > Jeff All tests are passing now with the new patch. By default the socket is written into the original location (.git directory) unless 'fsmonitor.allowRemote' is true. Only then is $HOME used and if $HOME proves unsuitable the user can override that by setting 'fsmonitor.sockerDir' to some valid, local directory. -Eric
diff --git a/compat/fsmonitor/fsm-settings-darwin.c b/compat/fsmonitor/fsm-settings-darwin.c index efc732c0f31..9e2ea3b90cc 100644 --- a/compat/fsmonitor/fsm-settings-darwin.c +++ b/compat/fsmonitor/fsm-settings-darwin.c @@ -2,10 +2,28 @@ #include "config.h" #include "repository.h" #include "fsmonitor-settings.h" +#include "fsmonitor-ipc.h" #include "fsmonitor.h" #include <sys/param.h> #include <sys/mount.h> +/* + * Check if monitoring remote working directories is allowed. + * + * By default, monitoring remote working directories is + * disabled. Users may override this behavior in enviroments where + * they have proper support. + */ +static int check_config_allowremote(struct repository *r) +{ + int allow; + + if (!repo_config_get_bool(r, "fsmonitor.allowremote", &allow)) + return allow; + + return -1; /* fsmonitor.allowremote not set */ +} + /* * [1] Remote working directories are problematic for FSMonitor. * @@ -27,24 +45,22 @@ * In theory, the above issues need to be addressed whether we are * using the Hook or IPC API. * + * So (for now at least), mark remote working directories as + * incompatible by default. + * * For the builtin FSMonitor, we create the Unix domain socket for the - * IPC in the .git directory. If the working directory is remote, - * then the socket will be created on the remote file system. This - * can fail if the remote file system does not support UDS file types - * (e.g. smbfs to a Windows server) or if the remote kernel does not - * allow a non-local process to bind() the socket. (These problems - * could be fixed by moving the UDS out of the .git directory and to a - * well-known local directory on the client machine, but care should - * be taken to ensure that $HOME is actually local and not a managed - * file share.) + * IPC in the temporary directory. If the temporary directory is + * remote, then the socket will be created on the remote file system. + * This can fail if the remote file system does not support UDS file + * types (e.g. smbfs to a Windows server) or if the remote kernel does + * not allow a non-local process to bind() the socket. * - * So (for now at least), mark remote working directories as - * incompatible. + * Therefore remote UDS locations are marked as incompatible. * * * [2] FAT32 and NTFS working directories are problematic too. * - * The builtin FSMonitor uses a Unix domain socket in the .git + * The builtin FSMonitor uses a Unix domain socket in the temporary * directory for IPC. These Windows drive formats do not support * Unix domain sockets, so mark them as incompatible for the daemon. * @@ -65,6 +81,39 @@ static enum fsmonitor_reason check_volume(struct repository *r) "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'", r->worktree, fs.f_type, fs.f_flags, fs.f_fstypename); + if (!(fs.f_flags & MNT_LOCAL)) { + switch (check_config_allowremote(r)) { + case 0: /* config overrides and disables */ + return FSMONITOR_REASON_REMOTE; + case 1: /* config overrides and enables */ + return FSMONITOR_REASON_OK; + default: + break; /* config has no opinion */ + } + + return FSMONITOR_REASON_REMOTE; + } + + return FSMONITOR_REASON_OK; +} + +static enum fsmonitor_reason check_uds_volume(void) +{ + struct statfs fs; + const char *path = fsmonitor_ipc__get_path(); + + if (statfs(path, &fs) == -1) { + int saved_errno = errno; + trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s", + path, strerror(saved_errno)); + errno = saved_errno; + return FSMONITOR_REASON_ERROR; + } + + trace_printf_key(&trace_fsmonitor, + "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'", + path, fs.f_type, fs.f_flags, fs.f_fstypename); + if (!(fs.f_flags & MNT_LOCAL)) return FSMONITOR_REASON_REMOTE; @@ -85,5 +134,9 @@ enum fsmonitor_reason fsm_os__incompatible(struct repository *r) if (reason != FSMONITOR_REASON_OK) return reason; + reason = check_uds_volume(); + if (reason != FSMONITOR_REASON_OK) + return reason; + return FSMONITOR_REASON_OK; } diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c index 789e7397baa..6e9b40a03d5 100644 --- a/fsmonitor-ipc.c +++ b/fsmonitor-ipc.c @@ -4,6 +4,7 @@ #include "fsmonitor-ipc.h" #include "run-command.h" #include "strbuf.h" +#include "tempfile.h" #include "trace2.h" #ifndef HAVE_FSMONITOR_DAEMON_BACKEND @@ -47,7 +48,51 @@ int fsmonitor_ipc__is_supported(void) return 1; } -GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc") +GIT_PATH_FUNC(fsmonitor_ipc__get_pathfile, "fsmonitor--daemon.ipc") + +static char *gen_ipc_file(void) +{ + char *retval = NULL; + struct tempfile *ipc; + + const char *ipc_file = fsmonitor_ipc__get_pathfile(); + FILE *fp = fopen(ipc_file, "w"); + + if (!fp) + die_errno("error opening '%s'", ipc_file); + ipc = mks_tempfile_t("fsmonitor_ipc_XXXXXX"); + strbuf_write(&ipc->filename, fp); + fclose(fp); + retval = strbuf_detach(&ipc->filename, NULL); + strbuf_release(&ipc->filename); + return retval; +} + +const char *fsmonitor_ipc__get_path(void) +{ + char *retval = NULL; + struct strbuf sb = STRBUF_INIT; + + const char *ipc_file = fsmonitor_ipc__get_pathfile(); + FILE *fp = fopen(ipc_file, "r"); + + if (!fp) { + return gen_ipc_file(); + } else { + strbuf_read(&sb, fileno(fp), 0); + fclose(fp); + fp = fopen(sb.buf, "r"); + if (!fp) { /* generate new file */ + if (unlink(ipc_file) < 0) + die_errno("could not remove '%s'", ipc_file); + return gen_ipc_file(); + } + fclose(fp); + retval = strbuf_detach(&sb, NULL); + strbuf_release(&sb); + return retval; + } +} enum ipc_active_state fsmonitor_ipc__get_state(void) { diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h index b6a7067c3af..63277dea39e 100644 --- a/fsmonitor-ipc.h +++ b/fsmonitor-ipc.h @@ -18,6 +18,12 @@ int fsmonitor_ipc__is_supported(void); */ const char *fsmonitor_ipc__get_path(void); +/* + * Returns the pathname to the file that contains the pathname to the + * IPC named pipe or Unix domain socket. + */ +const char *fsmonitor_ipc__get_pathfile(void); + /* * Try to determine whether there is a `git-fsmonitor--daemon` process * listening on the IPC pipe/socket.