Message ID | 6efdc6ed74ec9224d93a1b88ff8be85d533cb30f.1664048782.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fsmonitor: option to allow fsmonitor to run against network-mounted repos | expand |
On Sat, Sep 24 2022, Eric DeCosta via GitGitGadget wrote: > From: Eric DeCosta <edecosta@mathworks.com> > [...] > + state.alias.alias = NULL; > + state.alias.points_to = NULL; > + if (fsmonitor__get_alias(state.path_worktree_watch.buf, &state.alias)) { > + err = error(_("could not get worktree alias")); > + goto done; > + } As we can see here this is in the one-off setup code... > +int fsmonitor__get_alias(const char *path, struct alias_info *info) > +{ > + DIR * dir; > + int read; > + int retval; > + struct dirent *de; > + struct strbuf alias; > + struct strbuf points_to; ...more of a code clarity comment than anything, else, but... > + > + retval = 0; > + dir = opendir("/"); > + if (!dir) > + return -1; > + > + strbuf_init(&alias, 256); > + strbuf_init(&points_to, MAXPATHLEN); ...can't we just use the STRBUF_INIT macro here instead? most paths are nowhere near MAXPATHLEN, but more importantly we try to avoid these sorts of memory micro-managements except for hot codepaths. In this case it's just the one-off setup of fsmonitor, isn't it? So just using the default allocation pattern seems worthwhile, and will save e.g. anyone grepping for MAXPATHLEN looking for bugs (the MAXPATHLEN is sometimes not the actual maximum pathlen). > + > + while ((de = readdir(dir)) != NULL) { > + strbuf_reset(&alias); > + strbuf_addch(&alias, '/'); > + strbuf_add(&alias, de->d_name, strlen(de->d_name)); > + > + read = readlink(alias.buf, points_to.buf, MAXPATHLEN); > + if (read > 0) { > + strbuf_setlen(&points_to, read); > + if ((strncmp(points_to.buf, path, points_to.len) == 0) We usually do (!strcmp()), not strcmp() == 0, ditto strncmp. See CodingGuidelines. > + done: Nit: labels shouldn't be indented. > + closedir(dir); We checked the opendir() return value, why not closedir() too? > + if ((strncmp(info->alias, path, len) == 0) ditto !foo() v.s. foo() == 0. > + && path[len] == '/') { > + struct strbuf tmp; > + const char *remainder = path + len; > + int ptr_len = strlen(info->points_to); > + int rem_len = strlen(remainder); Make these s/int/size_t/. > + > + strbuf_init(&tmp, ptr_len + rem_len); And use st_add() here instead of " + ". I don't think it'll overflow, but it's good to guard overflows out of habit... > + strbuf_add(&tmp, info->points_to, ptr_len); Earlier you constructed a strbuf, and then strbuf_detached() it into this new "struct alias_info" you made. And now we're having to strlen() that to get the lenght that we knew earlier? Can't we just make the member a "struct strbuf" instead? Maybe not, I have not reviewed that aspect carefully...
On Sat, Sep 24 2022, Eric DeCosta via GitGitGadget wrote: > From: Eric DeCosta <edecosta@mathworks.com> ..one aspect I missed... > + state.alias.alias = NULL; > + state.alias.points_to = NULL; > + if (fsmonitor__get_alias(state.path_worktree_watch.buf, &state.alias)) { > + err = error(_("could not get worktree alias")); > + goto done; Okey, it errored and we call error() to say it didn't work, good so far, but... > +int fsmonitor__get_alias(const char *path, struct alias_info *info) > +{ > + DIR * dir; > + int read; > + int retval; ...we could just... > + struct dirent *de; > + struct strbuf alias; > + struct strbuf points_to; > + > + retval = 0; ...have initialized that above if we do it unconditionally, but more on this below... > + dir = opendir("/"); > + if (!dir) > + return -1; Here in the actual implementation, which looking at the end-state we *only* end up calling from that one caller we could have called error_errno() to get a better message, but didn't. I think much better would be to skip that above entirely, or keep it you want two errors, but then just have the more meaningful error_errno() here, where we're closer to the error, and can report a better one. Of course we might sometimes have a good error, and sometimes a bad one, but...(continued below) > + > + strbuf_init(&alias, 256); > + strbuf_init(&points_to, MAXPATHLEN); > + > + while ((de = readdir(dir)) != NULL) { > + strbuf_reset(&alias); > + strbuf_addch(&alias, '/'); > + strbuf_add(&alias, de->d_name, strlen(de->d_name)); > + > + read = readlink(alias.buf, points_to.buf, MAXPATHLEN); I think a: if (!read) BUG("got 0 from readlink?"); Or something would be a good paranoia addition, as you're technically relying on... > + if (read > 0) { > + strbuf_setlen(&points_to, read); > + if ((strncmp(points_to.buf, path, points_to.len) == 0) > + && path[points_to.len] == '/') { > + info->alias = strbuf_detach(&alias, NULL); > + info->points_to = strbuf_detach(&points_to, NULL); > + trace_printf_key(&trace_fsmonitor, > + "Found alias for '%s' : '%s' -> '%s'", > + path, info->alias, info->points_to); > + retval = 0; > + goto done; > + } > + } else if (errno != EINVAL) { /* Something other than not a link */ ...the possibility that we return 0 but a stale errno happens to be set, I don't think it'll happen in practice and that it always returned -1 if we get here, but being strict with calling syscalls is generally good. > + trace_printf_key(&trace_fsmonitor, "Error %s", strerror(errno)); (continued from above)..here we see the only codepath that sets retval != 0, > + retval = -1; Here we could have just called error_errno() instead. > + * The caller owns the storage that the returned string occupies and > + * is responsible for releasing it with `free(3)` when done. nit: we could just put a full stop after "it" and skip the rest. I.e. trust that the reader knows that allocated memory is freed with free(). > + */ > +char *fsmonitor__resolve_alias(const char *path, > + const struct alias_info *info); > + > + nit: extra whitespace at end of file. > #endif
> -----Original Message----- > From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Sent: Monday, September 26, 2022 11:16 AM > To: Eric DeCosta via GitGitGadget <gitgitgadget@gmail.com> > Cc: git@vger.kernel.org; Jeff Hostetler <git@jeffhostetler.com>; Eric Sunshine > <sunshine@sunshineco.com>; Torsten Bögershausen <tboegi@web.de>; > Ramsay Jones <ramsay@ramsayjones.plus.com>; Johannes Schindelin > <Johannes.Schindelin@gmx.de>; Eric DeCosta <edecosta@mathworks.com> > Subject: Re: [PATCH v12 4/6] fsmonitor: deal with synthetic firmlinks on > macOS > > > On Sat, Sep 24 2022, Eric DeCosta via GitGitGadget wrote: > > > From: Eric DeCosta <edecosta@mathworks.com> [...] > > + state.alias.alias = NULL; > > + state.alias.points_to = NULL; > > + if (fsmonitor__get_alias(state.path_worktree_watch.buf, > &state.alias)) { > > + err = error(_("could not get worktree alias")); > > + goto done; > > + } > > As we can see here this is in the one-off setup code... > > > +int fsmonitor__get_alias(const char *path, struct alias_info *info) { > > + DIR * dir; > > + int read; > > + int retval; > > + struct dirent *de; > > + struct strbuf alias; > > + struct strbuf points_to; > > ...more of a code clarity comment than anything, else, but... > > > + > > + retval = 0; > > + dir = opendir("/"); > > + if (!dir) > > + return -1; > > + > > + strbuf_init(&alias, 256); > > + strbuf_init(&points_to, MAXPATHLEN); > > > ...can't we just use the STRBUF_INIT macro here instead? most paths are > nowhere near MAXPATHLEN, but more importantly we try to avoid these > sorts of memory micro-managements except for hot codepaths. > > In this case it's just the one-off setup of fsmonitor, isn't it? So just using the > default allocation pattern seems worthwhile, and will save e.g. anyone > grepping for MAXPATHLEN looking for bugs (the MAXPATHLEN is sometimes > not the actual maximum pathlen). > OK, makes sense. > > + > > + while ((de = readdir(dir)) != NULL) { > > + strbuf_reset(&alias); > > + strbuf_addch(&alias, '/'); > > + strbuf_add(&alias, de->d_name, strlen(de->d_name)); > > + > > + read = readlink(alias.buf, points_to.buf, MAXPATHLEN); > > + if (read > 0) { > > + strbuf_setlen(&points_to, read); > > + if ((strncmp(points_to.buf, path, points_to.len) == 0) > > We usually do (!strcmp()), not strcmp() == 0, ditto strncmp. See > CodingGuidelines. > > + done: > Fixed. > Nit: labels shouldn't be indented. > Fixed > > + closedir(dir); > > We checked the opendir() return value, why not closedir() too? > OK, will do. > > + if ((strncmp(info->alias, path, len) == 0) > > ditto !foo() v.s. foo() == 0. > > > + && path[len] == '/') { > > + struct strbuf tmp; > > + const char *remainder = path + len; > > + int ptr_len = strlen(info->points_to); > > + int rem_len = strlen(remainder); > > Make these s/int/size_t/. > > > + > > + strbuf_init(&tmp, ptr_len + rem_len); > > And use st_add() here instead of " + ". I don't think it'll overflow, but it's good > to guard overflows out of habit... > Sure, just strbuf_add(). Then I don't need ptr_len or rem_len either. > > + strbuf_add(&tmp, info->points_to, ptr_len); > > Earlier you constructed a strbuf, and then strbuf_detached() it into this new > "struct alias_info" you made. And now we're having to strlen() that to get the > lenght that we knew earlier? > > Can't we just make the member a "struct strbuf" instead? Maybe not, I have > not reviewed that aspect carefully... > Certainly could do that. -Eric
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c index 0123fc33ed2..56fcd1c2baa 100644 --- a/builtin/fsmonitor--daemon.c +++ b/builtin/fsmonitor--daemon.c @@ -3,6 +3,7 @@ #include "parse-options.h" #include "fsmonitor.h" #include "fsmonitor-ipc.h" +#include "fsmonitor-path-utils.h" #include "compat/fsmonitor/fsm-health.h" #include "compat/fsmonitor/fsm-listen.h" #include "fsmonitor--daemon.h" @@ -1282,6 +1283,13 @@ static int fsmonitor_run_daemon(void) strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree())); state.nr_paths_watching = 1; + state.alias.alias = NULL; + state.alias.points_to = NULL; + if (fsmonitor__get_alias(state.path_worktree_watch.buf, &state.alias)) { + err = error(_("could not get worktree alias")); + goto done; + } + /* * We create and delete cookie files somewhere inside the .git * directory to help us keep sync with the file system. If diff --git a/compat/fsmonitor/fsm-listen-darwin.c b/compat/fsmonitor/fsm-listen-darwin.c index 8e208e8289e..daeee4e465c 100644 --- a/compat/fsmonitor/fsm-listen-darwin.c +++ b/compat/fsmonitor/fsm-listen-darwin.c @@ -26,6 +26,7 @@ #include "fsmonitor.h" #include "fsm-listen.h" #include "fsmonitor--daemon.h" +#include "fsmonitor-path-utils.h" struct fsm_listen_data { @@ -198,8 +199,9 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, struct string_list cookie_list = STRING_LIST_INIT_DUP; const char *path_k; const char *slash; - int k; + char *resolved = NULL; struct strbuf tmp = STRBUF_INIT; + int k; /* * Build a list of all filesystem changes into a private/local @@ -209,7 +211,12 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, /* * On Mac, we receive an array of absolute paths. */ - path_k = paths[k]; + free(resolved); + resolved = fsmonitor__resolve_alias(paths[k], &state->alias); + if (resolved) + path_k = resolved; + else + path_k = paths[k]; /* * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR. @@ -238,6 +245,7 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, fsmonitor_force_resync(state); fsmonitor_batch__free_list(batch); string_list_clear(&cookie_list, 0); + batch = NULL; /* * We assume that any events that we received @@ -360,12 +368,14 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, } } + free(resolved); fsmonitor_publish(state, batch, &cookie_list); string_list_clear(&cookie_list, 0); strbuf_release(&tmp); return; force_shutdown: + free(resolved); fsmonitor_batch__free_list(batch); string_list_clear(&cookie_list, 0); diff --git a/compat/fsmonitor/fsm-path-utils-darwin.c b/compat/fsmonitor/fsm-path-utils-darwin.c index 067cbe6990a..13807f58e95 100644 --- a/compat/fsmonitor/fsm-path-utils-darwin.c +++ b/compat/fsmonitor/fsm-path-utils-darwin.c @@ -1,5 +1,8 @@ #include "fsmonitor.h" #include "fsmonitor-path-utils.h" +#include <dirent.h> +#include <errno.h> +#include <fcntl.h> #include <sys/param.h> #include <sys/mount.h> @@ -38,3 +41,92 @@ int fsmonitor__is_fs_remote(const char *path) return -1; return fs.is_remote; } + +/* + * Scan the root directory for synthetic firmlinks that when resolved + * are a prefix of the path, stopping at the first one found. + * + * Some information about firmlinks and synthetic firmlinks: + * https://eclecticlight.co/2020/01/23/catalina-boot-volumes/ + * + * macOS no longer allows symlinks in the root directory; any link found + * there is therefore a synthetic firmlink. + * + * If this function gets called often, will want to cache all the firmlink + * information, but for now there is only one caller of this function. + * + * If there is more than one alias for the path, that is another + * matter altogether. + */ +int fsmonitor__get_alias(const char *path, struct alias_info *info) +{ + DIR * dir; + int read; + int retval; + struct dirent *de; + struct strbuf alias; + struct strbuf points_to; + + retval = 0; + dir = opendir("/"); + if (!dir) + return -1; + + strbuf_init(&alias, 256); + strbuf_init(&points_to, MAXPATHLEN); + + while ((de = readdir(dir)) != NULL) { + strbuf_reset(&alias); + strbuf_addch(&alias, '/'); + strbuf_add(&alias, de->d_name, strlen(de->d_name)); + + read = readlink(alias.buf, points_to.buf, MAXPATHLEN); + if (read > 0) { + strbuf_setlen(&points_to, read); + if ((strncmp(points_to.buf, path, points_to.len) == 0) + && path[points_to.len] == '/') { + info->alias = strbuf_detach(&alias, NULL); + info->points_to = strbuf_detach(&points_to, NULL); + trace_printf_key(&trace_fsmonitor, + "Found alias for '%s' : '%s' -> '%s'", + path, info->alias, info->points_to); + retval = 0; + goto done; + } + } else if (errno != EINVAL) { /* Something other than not a link */ + trace_printf_key(&trace_fsmonitor, "Error %s", strerror(errno)); + retval = -1; + goto done; + } + } + + done: + closedir(dir); + strbuf_release(&alias); + strbuf_release(&points_to); + return retval; +} + +char *fsmonitor__resolve_alias(const char *path, + const struct alias_info *info) +{ + int len = info->alias ? strlen(info->alias) : 0; + + if (!len) + return NULL; + + if ((strncmp(info->alias, path, len) == 0) + && path[len] == '/') { + struct strbuf tmp; + const char *remainder = path + len; + int ptr_len = strlen(info->points_to); + int rem_len = strlen(remainder); + + strbuf_init(&tmp, ptr_len + rem_len); + strbuf_add(&tmp, info->points_to, ptr_len); + strbuf_add(&tmp, remainder, rem_len); + return strbuf_detach(&tmp, NULL); + } + + return NULL; +} diff --git a/compat/fsmonitor/fsm-path-utils-win32.c b/compat/fsmonitor/fsm-path-utils-win32.c index a90b8f7925b..0d95bbb416f 100644 --- a/compat/fsmonitor/fsm-path-utils-win32.c +++ b/compat/fsmonitor/fsm-path-utils-win32.c @@ -126,3 +126,20 @@ int fsmonitor__is_fs_remote(const char *path) return -1; return fs.is_remote; } + +/* + * No-op for now. + */ +int fsmonitor__get_alias(const char *path, struct alias_info *info) +{ + return 0; +} + +/* + * No-op for now. + */ +char *fsmonitor__resolve_alias(const char *path, + const struct alias_info *info) +{ + return NULL; +} diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h index 2102a5c9ff5..e24838f9a86 100644 --- a/fsmonitor--daemon.h +++ b/fsmonitor--daemon.h @@ -8,6 +8,7 @@ #include "run-command.h" #include "simple-ipc.h" #include "thread-utils.h" +#include "fsmonitor-path-utils.h" struct fsmonitor_batch; struct fsmonitor_token_data; @@ -43,6 +44,7 @@ struct fsmonitor_daemon_state { struct strbuf path_worktree_watch; struct strbuf path_gitdir_watch; + struct alias_info alias; int nr_paths_watching; struct fsmonitor_token_data *current_token_data; @@ -59,6 +61,7 @@ struct fsmonitor_daemon_state { struct ipc_server_data *ipc_server_data; struct strbuf path_ipc; + }; /* diff --git a/fsmonitor-path-utils.h b/fsmonitor-path-utils.h index e48592887e7..50ef37e57bb 100644 --- a/fsmonitor-path-utils.h +++ b/fsmonitor-path-utils.h @@ -1,6 +1,14 @@ #ifndef FSM_PATH_UTILS_H #define FSM_PATH_UTILS_H +#include "strbuf.h" + +struct alias_info +{ + char *alias; + char *points_to; +}; + struct fs_info { int is_remote; char *typename; @@ -20,4 +28,32 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info); */ int fsmonitor__is_fs_remote(const char *path); +/* + * Get the alias in given path, if any. + * + * Sets alias to the first alias that matches any part of the path. + * + * If an alias is found, info.alias and info.points_to are set to the + * found mapping. + * + * Returns -1 on error, 0 otherwise. + * + * The caller owns the storage that is occupied by set info.alias and + * info.points_to and is responsible for releasing it with `free(3)` + * when done. + */ +int fsmonitor__get_alias(const char *path, struct alias_info *info); + +/* + * Resolve the path against the given alias. + * + * Returns the resolved path if there is one, NULL otherwise. + * + * The caller owns the storage that the returned string occupies and + * is responsible for releasing it with `free(3)` when done. + */ +char *fsmonitor__resolve_alias(const char *path, + const struct alias_info *info); + + #endif