Message ID | 5bba5eb3d1bd172f09fdf6eb2e9b8ac4dd7f940f.1625150864.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Builtin FSMonitor Feature | expand |
On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhost@microsoft.com> > > Teach the win32 backend to register a watch on the working tree > root directory (recursively). Also watch the <gitdir> if it is > not inside the working tree. And to collect path change notifications > into batches and publish. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- > compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ <bikeshed mode> Spying on the early history of this (looking for the Linux backend) I saw that at some point we had just compat/fsmonitor/linux.c, and presumably some of compat/fsmonitor/{windows,win32,macos,darwin}.c. At some point those filenames became much much longer. I've noticed you tend to prefer really long file and function names, e.g. your borrowed daemonize() became spawn_background_fsmonitor_daemon(), I think aiming for shorter filenames & function names helps, e.g. these long names widen diffstats, and many people who hack on the code stick religiously to 80 character width terminals.
Hi Jeff, On Thu, 1 Jul 2021, Jeff Hostetler via GitGitGadget wrote: Jeff Hostetler <jeffhost@microsoft.com> the win32 backend to register a watch on the working tree irectory (recursively). Also watch the <gitdir> if it is side the working tree. And to collect path change notifications atches and publish. -off-by: Jeff Hostetler <jeffhost@microsoft.com> t/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ e changed, 530 insertions(+) > diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c > index 880446b49e3..d707d47a0d7 100644 > --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c > +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c > @@ -2,20 +2,550 @@ > + [...] > + > +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state, > + const char *path) > +{ > + struct one_watch *watch = NULL; > + DWORD desired_access = FILE_LIST_DIRECTORY; > + DWORD share_mode = > + FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; > + HANDLE hDir; > + > + hDir = CreateFileA(path, > + desired_access, share_mode, NULL, OPEN_EXISTING, > + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, > + NULL); The `*A()` family of Win32 API functions disagree with Git in one very interesting aspect: Git always assumes UTF-8, while e.g. `CreateFileA()` will use the current Win32 locale to internally transform to wide characters and then call `CreateFileW()`. This poses no problem when your locale is US American and your paths contain no non-ASCII characters. In the Git for Windows bug tracker, it was reported that it _does_ cause problems when venturing outside such a cozy scenario (for full details, see https://github.com/git-for-windows/git/issues/3262) I need this (and merged it before starting the process to release Git for Windows v2.32.0(2)) to fix that (could I ask you to integrate this in case a re-roll will become necessary?): -- snipsnap -- From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 5 Jul 2021 13:51:05 +0200 Subject: [PATCH] fixup! fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Let's keep avoiding the `*A()` family of Win32 API functions because they are susceptible to incoherent encoding problems. In Git for Windows, we always assume paths to be UTF-8 encoded. Let's use the dedicated helper to convert such a path to the wide character version, and then use the `*W()` function instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/fsmonitor/fsmonitor-fs-listen-win32.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c index ba087b292df..3b42ab311d9 100644 --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c @@ -111,8 +111,14 @@ static struct one_watch *create_watch(struct fsmonitor_daemon_state *state, DWORD share_mode = FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; HANDLE hDir; + wchar_t wpath[MAX_PATH]; - hDir = CreateFileA(path, + if (xutftowcs_path(wpath, path) < 0) { + error(_("could not convert to wide characters: '%s'"), path); + return NULL; + } + + hDir = CreateFileW(wpath, desired_access, share_mode, NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, NULL); -- 2.32.0.windows.1.15.gf1590a75e2d
On 7/6/21 3:09 PM, Johannes Schindelin wrote: > Hi Jeff, > > > On Thu, 1 Jul 2021, Jeff Hostetler via GitGitGadget wrote: > > Jeff Hostetler <jeffhost@microsoft.com> > > the win32 backend to register a watch on the working tree > irectory (recursively). Also watch the <gitdir> if it is > side the working tree. And to collect path change notifications > atches and publish. > > -off-by: Jeff Hostetler <jeffhost@microsoft.com> > > t/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ > e changed, 530 insertions(+) > >> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c >> index 880446b49e3..d707d47a0d7 100644 >> --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c >> +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c >> @@ -2,20 +2,550 @@ >> + [...] >> + >> +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state, >> + const char *path) >> +{ >> + struct one_watch *watch = NULL; >> + DWORD desired_access = FILE_LIST_DIRECTORY; >> + DWORD share_mode = >> + FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; >> + HANDLE hDir; >> + >> + hDir = CreateFileA(path, >> + desired_access, share_mode, NULL, OPEN_EXISTING, >> + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, >> + NULL); > > The `*A()` family of Win32 API functions disagree with Git in one very > interesting aspect: Git always assumes UTF-8, while e.g. `CreateFileA()` > will use the current Win32 locale to internally transform to wide > characters and then call `CreateFileW()`. > > This poses no problem when your locale is US American and your paths > contain no non-ASCII characters. > > In the Git for Windows bug tracker, it was reported that it _does_ cause > problems when venturing outside such a cozy scenario (for full details, > see https://github.com/git-for-windows/git/issues/3262) > > I need this (and merged it before starting the process to release Git for > Windows v2.32.0(2)) to fix that (could I ask you to integrate this in case > a re-roll will become necessary?): > > -- snipsnap -- > From: Johannes Schindelin <johannes.schindelin@gmx.de> > Date: Mon, 5 Jul 2021 13:51:05 +0200 > Subject: [PATCH] fixup! fsmonitor-fs-listen-win32: implement FSMonitor backend > on Windows > > Let's keep avoiding the `*A()` family of Win32 API functions because > they are susceptible to incoherent encoding problems. In Git for > Windows, we always assume paths to be UTF-8 encoded. Let's use the > dedicated helper to convert such a path to the wide character version, > and then use the `*W()` function instead. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > compat/fsmonitor/fsmonitor-fs-listen-win32.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c > index ba087b292df..3b42ab311d9 100644 > --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c > +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c > @@ -111,8 +111,14 @@ static struct one_watch *create_watch(struct fsmonitor_daemon_state *state, > DWORD share_mode = > FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; > HANDLE hDir; > + wchar_t wpath[MAX_PATH]; > > - hDir = CreateFileA(path, > + if (xutftowcs_path(wpath, path) < 0) { > + error(_("could not convert to wide characters: '%s'"), path); > + return NULL; > + } > + > + hDir = CreateFileW(wpath, > desired_access, share_mode, NULL, OPEN_EXISTING, > FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, > NULL); > -- > 2.32.0.windows.1.15.gf1590a75e2d > Thanks for the heads up. I'll pull this into my next release. Jeff
On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: > >> From: Jeff Hostetler <jeffhost@microsoft.com> >> >> Teach the win32 backend to register a watch on the working tree >> root directory (recursively). Also watch the <gitdir> if it is >> not inside the working tree. And to collect path change notifications >> into batches and publish. >> >> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >> --- >> compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ > > <bikeshed mode> Spying on the early history of this (looking for the > Linux backend) I saw that at some point we had just > compat/fsmonitor/linux.c, and presumably some of > compat/fsmonitor/{windows,win32,macos,darwin}.c. > > At some point those filenames became much much longer. > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c" would cause confusion in the debugger (I've long since forgotten which). Breaking at win32.c:30 was no longer unique. Also, if the Makefile sends all .o's to the root directory or a unified OBJS directory rather than to the subdir containing the .c, then we have another issue during linking... So, having been burned too many times, I prefer to make source filenames unique when possible. > I've noticed you tend to prefer really long file and function names, > e.g. your borrowed daemonize() became > spawn_background_fsmonitor_daemon(), I think aiming for shorter > filenames & function names helps, e.g. these long names widen diffstats, > and many people who hack on the code stick religiously to 80 character > width terminals. > I prefer self-documenting code. Jeff
On Tue, Jul 13 2021, Jeff Hostetler wrote: > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: >> >>> From: Jeff Hostetler <jeffhost@microsoft.com> >>> >>> Teach the win32 backend to register a watch on the working tree >>> root directory (recursively). Also watch the <gitdir> if it is >>> not inside the working tree. And to collect path change notifications >>> into batches and publish. >>> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >>> --- >>> compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ >> <bikeshed mode> Spying on the early history of this (looking for the >> Linux backend) I saw that at some point we had just >> compat/fsmonitor/linux.c, and presumably some of >> compat/fsmonitor/{windows,win32,macos,darwin}.c. >> At some point those filenames became much much longer. >> > > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c" > would cause confusion in the debugger (I've long since forgotten > which). Breaking at win32.c:30 was no longer unique. > > Also, if the Makefile sends all .o's to the root directory or a > unified OBJS directory rather than to the subdir containing the .c, > then we have another issue during linking... > > So, having been burned too many times, I prefer to make source > filenames unique when possible. A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve that goal. >> I've noticed you tend to prefer really long file and function names, >> e.g. your borrowed daemonize() became >> spawn_background_fsmonitor_daemon(), I think aiming for shorter >> filenames & function names helps, e.g. these long names widen diffstats, >> and many people who hack on the code stick religiously to 80 character >> width terminals. >> > > I prefer self-documenting code. Sure, I'm not saying daemonize() is an ideal name, just suggesting that you can both get uniqueness & self-documentation and not need to split to multiple lines in some common cases to stay within the "We try to keep to at most 80 characters per line" in CodingGuidelines in this series.
Hi Ævar, On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Jul 13 2021, Jeff Hostetler wrote: > > > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: > >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: > >> > >>> From: Jeff Hostetler <jeffhost@microsoft.com> > >>> > >>> Teach the win32 backend to register a watch on the working tree > >>> root directory (recursively). Also watch the <gitdir> if it is > >>> not inside the working tree. And to collect path change notifications > >>> into batches and publish. > >>> > >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > >>> --- > >>> compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ > >> <bikeshed mode> Spying on the early history of this (looking for the > >> Linux backend) I saw that at some point we had just > >> compat/fsmonitor/linux.c, and presumably some of > >> compat/fsmonitor/{windows,win32,macos,darwin}.c. > >> At some point those filenames became much much longer. > >> > > > > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c" > > would cause confusion in the debugger (I've long since forgotten > > which). Breaking at win32.c:30 was no longer unique. > > > > Also, if the Makefile sends all .o's to the root directory or a > > unified OBJS directory rather than to the subdir containing the .c, > > then we have another issue during linking... > > > > So, having been burned too many times, I prefer to make source > > filenames unique when possible. > > A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve > that goal. > > >> I've noticed you tend to prefer really long file and function names, > >> e.g. your borrowed daemonize() became > >> spawn_background_fsmonitor_daemon(), I think aiming for shorter > >> filenames & function names helps, e.g. these long names widen diffstats, > >> and many people who hack on the code stick religiously to 80 character > >> width terminals. > >> > > > > I prefer self-documenting code. > > Sure, I'm not saying daemonize() is an ideal name, just suggesting that > you can both get uniqueness & self-documentation and not need to split > to multiple lines in some common cases to stay within the "We try to > keep to at most 80 characters per line" in CodingGuidelines in this > series. While you are entitled to have your taste, I have to point out that Jeff is just as entitled to their taste, and I don't think that you can claim that yours is better. So I wonder what the intended outcome of this review is? To make the patch better? Or to pit taste against taste? Ciao, Johannes
On Fri, Jul 16 2021, Johannes Schindelin wrote: > Hi Ævar, > > On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote: > >> >> On Tue, Jul 13 2021, Jeff Hostetler wrote: >> >> > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: >> >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: >> >> >> >>> From: Jeff Hostetler <jeffhost@microsoft.com> >> >>> >> >>> Teach the win32 backend to register a watch on the working tree >> >>> root directory (recursively). Also watch the <gitdir> if it is >> >>> not inside the working tree. And to collect path change notifications >> >>> into batches and publish. >> >>> >> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >> >>> --- >> >>> compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ >> >> <bikeshed mode> Spying on the early history of this (looking for the >> >> Linux backend) I saw that at some point we had just >> >> compat/fsmonitor/linux.c, and presumably some of >> >> compat/fsmonitor/{windows,win32,macos,darwin}.c. >> >> At some point those filenames became much much longer. >> >> >> > >> > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c" >> > would cause confusion in the debugger (I've long since forgotten >> > which). Breaking at win32.c:30 was no longer unique. >> > >> > Also, if the Makefile sends all .o's to the root directory or a >> > unified OBJS directory rather than to the subdir containing the .c, >> > then we have another issue during linking... >> > >> > So, having been burned too many times, I prefer to make source >> > filenames unique when possible. >> >> A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve >> that goal. >> >> >> I've noticed you tend to prefer really long file and function names, >> >> e.g. your borrowed daemonize() became >> >> spawn_background_fsmonitor_daemon(), I think aiming for shorter >> >> filenames & function names helps, e.g. these long names widen diffstats, >> >> and many people who hack on the code stick religiously to 80 character >> >> width terminals. >> >> >> > >> > I prefer self-documenting code. >> >> Sure, I'm not saying daemonize() is an ideal name, just suggesting that >> you can both get uniqueness & self-documentation and not need to split >> to multiple lines in some common cases to stay within the "We try to >> keep to at most 80 characters per line" in CodingGuidelines in this >> series. > > While you are entitled to have your taste, I have to point out that Jeff > is just as entitled to their taste, and I don't think that you can claim > that yours is better. > > So I wonder what the intended outcome of this review is? To make the patch > better? Or to pit taste against taste? Neither, to address a misunderstanding. Sure, if a reviewer points out "maybe change X to Y" and the reply is "I like X better than Y", fair enough. My reading of Jeff H.'s upthread was that he'd misunderstood my suggesting of that Y for a Z. I.e. that shortening a name like fsmonitor-fs-listen-win32.c (X) necessarily had to mean that we'd have a win32.c (Z), negatively impacting some debugging workflows, as opposed to just a shorter-but-unique name like fsmon-win32.c (Y). Ditto for daemonize() (X/Z) and spawn_background_fsmonitor_daemon() (X). I'm certain that with this reply we're thoroughly into the "respectfully disagree" territory as opposed to having a misunderstanding. I also take and agree your implied point that there's no point in having a yes/no/yes/no/yes argument on-list, and I did not mean to engage in such a thing, only to clear up the misunderstanding, if any. I'll only say that I don't think that something like long variable/file etc. names is *just* a matter of taste, seeing as we have a fairly strict "keep to at most 80 characters per line" as the 2nd item in the C coding style (after "use tabs, not spaces"). That matter of taste for one developer objectively makes it harder to stay within the bounds of the coding style for furute maintenance. We do have active contributors that I understand actually use terminals of that size to work on this project (CC'd, but maybe I misrecall that for one/both). I'm not one of those people, but I do find that maintaining code with needlessly long identifiers in this codebase is painful. E.g. in a patch I just submitted I've been working on similarly long identifiers in the refs code[1], and with say a long variable/type name and using a long-named function you get to the point of needing to place each individual argument of the function on its own line, or near enough to that. 1. https://lore.kernel.org/git/patch-7.7-cb32b5c0526-20210716T142032Z-avarab@gmail.com/
Johannes Schindelin wrote: > On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Jul 13 2021, Jeff Hostetler wrote: > > > I prefer self-documenting code. > > > > Sure, I'm not saying daemonize() is an ideal name, just suggesting that > > you can both get uniqueness & self-documentation and not need to split > > to multiple lines in some common cases to stay within the "We try to > > keep to at most 80 characters per line" in CodingGuidelines in this > > series. > > While you are entitled to have your taste, I have to point out that Jeff > is just as entitled to their taste, and I don't think that you can claim > that yours is better. > > So I wonder what the intended outcome of this review is? To make the patch > better? Or to pit taste against taste? Unless you read minds you can't possibly know what the taste of other people will be. So you put forward what *you* think is better, and then find out if others agree with your taste or not. If it turns out you are the only one that thinks it's better, so be it. For what it's worth, I agree with Ævar that daemonize() is better and I find your statement "I prefer self-documenting code" a) not an argument, b) not a valid argument if we fill in the dots, and c) passively aggressive. Each one of us can only do one thing: express our opinion. What else can we do? Reviewers should not be chastised for expressing their opinion.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > On Fri, Jul 16 2021, Johannes Schindelin wrote: > > Hi Ævar, > > > > On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote: > > > >> > >> On Tue, Jul 13 2021, Jeff Hostetler wrote: > >> > >> > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: > >> >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: > >> >> > >> >>> From: Jeff Hostetler <jeffhost@microsoft.com> > >> >>> > >> >>> Teach the win32 backend to register a watch on the working tree > >> >>> root directory (recursively). Also watch the <gitdir> if it is > >> >>> not inside the working tree. And to collect path change notifications > >> >>> into batches and publish. > >> >>> > >> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > >> >>> --- > >> >>> compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++ > >> >> <bikeshed mode> Spying on the early history of this (looking for the > >> >> Linux backend) I saw that at some point we had just > >> >> compat/fsmonitor/linux.c, and presumably some of > >> >> compat/fsmonitor/{windows,win32,macos,darwin}.c. > >> >> At some point those filenames became much much longer. > >> >> > >> > > >> > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c" > >> > would cause confusion in the debugger (I've long since forgotten > >> > which). Breaking at win32.c:30 was no longer unique. > >> > > >> > Also, if the Makefile sends all .o's to the root directory or a > >> > unified OBJS directory rather than to the subdir containing the .c, > >> > then we have another issue during linking... > >> > > >> > So, having been burned too many times, I prefer to make source > >> > filenames unique when possible. > >> > >> A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve > >> that goal. > >> > >> >> I've noticed you tend to prefer really long file and function names, > >> >> e.g. your borrowed daemonize() became > >> >> spawn_background_fsmonitor_daemon(), I think aiming for shorter > >> >> filenames & function names helps, e.g. these long names widen diffstats, > >> >> and many people who hack on the code stick religiously to 80 character > >> >> width terminals. At least "daemon"/"daemonize" already implies "background"; so even if we have the extra function, "spawn_fsmon_daemon()" would be enough info. > >> >> > >> > > >> > I prefer self-documenting code. > >> > >> Sure, I'm not saying daemonize() is an ideal name, just suggesting that > >> you can both get uniqueness & self-documentation and not need to split > >> to multiple lines in some common cases to stay within the "We try to > >> keep to at most 80 characters per line" in CodingGuidelines in this > >> series. > > > > While you are entitled to have your taste, I have to point out that Jeff > > is just as entitled to their taste, and I don't think that you can claim > > that yours is better. > > > > So I wonder what the intended outcome of this review is? To make the patch > > better? Or to pit taste against taste? > > Neither, to address a misunderstanding. > > Sure, if a reviewer points out "maybe change X to Y" and the reply is "I > like X better than Y", fair enough. > > My reading of Jeff H.'s upthread was that he'd misunderstood my > suggesting of that Y for a Z. > > I.e. that shortening a name like fsmonitor-fs-listen-win32.c (X) > necessarily had to mean that we'd have a win32.c (Z), negatively > impacting some debugging workflows, as opposed to just a > shorter-but-unique name like fsmon-win32.c (Y). Short-as-possible-while-being-meaningful is a pretty important usability thing git. There's a good reason git supports OID prefix abbreviations, after all. Not my area of expertise, but AFAIK git's rename detection is affected by basename; and I've encountered debugger confusion with non-unique basenames while debugging other codebases. My brain works like a naive "strcmp"/"memcmp": long common prefixes slows down my ability to differentiate filenames. Having lots of common terms/prefixes on the screen works like camoflage to me and slows down my ability to process things. I suppose my eyes and cognitive abilities are below average; and even worse due to the pandemic numbing my brain. > Ditto for daemonize() (X/Z) and spawn_background_fsmonitor_daemon() (X). (what I said above) > I'm certain that with this reply we're thoroughly into the "respectfully > disagree" territory as opposed to having a misunderstanding. > > I also take and agree your implied point that there's no point in having > a yes/no/yes/no/yes argument on-list, and I did not mean to engage in > such a thing, only to clear up the misunderstanding, if any. > > I'll only say that I don't think that something like long variable/file > etc. names is *just* a matter of taste, seeing as we have a fairly > strict "keep to at most 80 characters per line" as the 2nd item in the C > coding style (after "use tabs, not spaces"). > > That matter of taste for one developer objectively makes it harder to > stay within the bounds of the coding style for furute maintenance. > > We do have active contributors that I understand actually use terminals > of that size to work on this project (CC'd, but maybe I misrecall that > for one/both). I'm not one of those people, but I do find that > maintaining code with needlessly long identifiers in this codebase is > painful. Thanks for Cc-ing me. Yes, I'm one of those developers. Accessibility matters to me: my eyesight certainly isn't getting better with age (nor do I expect anyone elses'). I need giant fonts to reduce eye and neck strain. Fwiw, newspaper publishers figured out line width decades/centuries ago and wrap lines despite having large sheets to work on. I mostly work over mosh or ssh to reduce noise and heat locally. There's no bandwidth for VNC or similar, and graphical stuff tends to be unstable UI-wise anyways so I stick to the terminal. Taste does have much to do with it: I favor stable, reliable tools (e.g. POSIX, Perl5, git) that works well on both old and new hardware. I avoid mainstream "desktop" software since they tend to have unstable UIs which break users' workflows while requiring more powerful HW. Complex graphics drivers tend to get unreliable, too, especially when one is stuck with old HW that gets limited support from vendors. It's also difficult to fix complex drivers as a hobbyist given the one-off HW/vendor-specific knowledge required. So we shouldn't expect a developer with old HW can have more than a standard text terminal. This is an accessibility problem for developers lacking in finances. This is also a problem for developers wishing to backdoors+bugs found in modern systems (IntelME, AMD-PSP, endless stream of CPU bugs). Back to health-related accessibility; I've also had joint problems for many years so shorter identifiers helps reduce typing I need to do. I mostly had that under control pre-pandemic, but it's been a huge struggle to find adequate replacements for activities I used to rely on to manage the pain.
On 7/17/21 8:45 AM, Eric Wong wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: >> On Fri, Jul 16 2021, Johannes Schindelin wrote: >>> Hi Ævar, >>> >>> On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote: >>> >>>> >>>> On Tue, Jul 13 2021, Jeff Hostetler wrote: >>>> >>>>> On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote: >>>>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: >>>>>> >>>>>>> From: Jeff Hostetler <jeffhost@microsoft.com> >>>>>>> ... Eric, welcome to the conversation and thanks for sharing your concerns. For my upcoming V4 I've shortened the filenames of the various backends, renamed the -macos one to -darwin, and shortened the names of the fsm-listener API, and the names of those static functions associated with starting the daemon in the background. I think this covers all of the issues raised across several patches in the series. Jeff
diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c index 880446b49e3..d707d47a0d7 100644 --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c @@ -2,20 +2,550 @@ #include "config.h" #include "fsmonitor.h" #include "fsmonitor-fs-listen.h" +#include "fsmonitor--daemon.h" + +/* + * The documentation of ReadDirectoryChangesW() states that the maximum + * buffer size is 64K when the monitored directory is remote. + * + * Larger buffers may be used when the monitored directory is local and + * will help us receive events faster from the kernel and avoid dropped + * events. + * + * So we try to use a very large buffer and silently fallback to 64K if + * we get an error. + */ +#define MAX_RDCW_BUF_FALLBACK (65536) +#define MAX_RDCW_BUF (65536 * 8) + +struct one_watch +{ + char buffer[MAX_RDCW_BUF]; + DWORD buf_len; + DWORD count; + + struct strbuf path; + HANDLE hDir; + HANDLE hEvent; + OVERLAPPED overlapped; + + /* + * Is there an active ReadDirectoryChangesW() call pending. If so, we + * need to later call GetOverlappedResult() and possibly CancelIoEx(). + */ + BOOL is_active; +}; + +struct fsmonitor_daemon_backend_data +{ + struct one_watch *watch_worktree; + struct one_watch *watch_gitdir; + + HANDLE hEventShutdown; + + HANDLE hListener[3]; /* we don't own these handles */ +#define LISTENER_SHUTDOWN 0 +#define LISTENER_HAVE_DATA_WORKTREE 1 +#define LISTENER_HAVE_DATA_GITDIR 2 + int nr_listener_handles; +}; + +/* + * Convert the WCHAR path from the notification into UTF8 and + * then normalize it. + */ +static int normalize_path_in_utf8(FILE_NOTIFY_INFORMATION *info, + struct strbuf *normalized_path) +{ + int reserve; + int len = 0; + + strbuf_reset(normalized_path); + if (!info->FileNameLength) + goto normalize; + + /* + * Pre-reserve enough space in the UTF8 buffer for + * each Unicode WCHAR character to be mapped into a + * sequence of 2 UTF8 characters. That should let us + * avoid ERROR_INSUFFICIENT_BUFFER 99.9+% of the time. + */ + reserve = info->FileNameLength + 1; + strbuf_grow(normalized_path, reserve); + + for (;;) { + len = WideCharToMultiByte(CP_UTF8, 0, info->FileName, + info->FileNameLength / sizeof(WCHAR), + normalized_path->buf, + strbuf_avail(normalized_path) - 1, + NULL, NULL); + if (len > 0) + goto normalize; + if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) { + error("[GLE %ld] could not convert path to UTF-8: '%.*ls'", + GetLastError(), + (int)(info->FileNameLength / sizeof(WCHAR)), + info->FileName); + return -1; + } + + strbuf_grow(normalized_path, + strbuf_avail(normalized_path) + reserve); + } + +normalize: + strbuf_setlen(normalized_path, len); + return strbuf_normalize_path(normalized_path); +} void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state) { + SetEvent(state->backend_data->hListener[LISTENER_SHUTDOWN]); +} + +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state, + const char *path) +{ + struct one_watch *watch = NULL; + DWORD desired_access = FILE_LIST_DIRECTORY; + DWORD share_mode = + FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; + HANDLE hDir; + + hDir = CreateFileA(path, + desired_access, share_mode, NULL, OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, + NULL); + if (hDir == INVALID_HANDLE_VALUE) { + error(_("[GLE %ld] could not watch '%s'"), + GetLastError(), path); + return NULL; + } + + CALLOC_ARRAY(watch, 1); + + watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */ + + strbuf_init(&watch->path, 0); + strbuf_addstr(&watch->path, path); + + watch->hDir = hDir; + watch->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL); + + return watch; +} + +static void destroy_watch(struct one_watch *watch) +{ + if (!watch) + return; + + strbuf_release(&watch->path); + if (watch->hDir != INVALID_HANDLE_VALUE) + CloseHandle(watch->hDir); + if (watch->hEvent != INVALID_HANDLE_VALUE) + CloseHandle(watch->hEvent); + + free(watch); +} + +static int start_rdcw_watch(struct fsmonitor_daemon_backend_data *data, + struct one_watch *watch) +{ + DWORD dwNotifyFilter = + FILE_NOTIFY_CHANGE_FILE_NAME | + FILE_NOTIFY_CHANGE_DIR_NAME | + FILE_NOTIFY_CHANGE_ATTRIBUTES | + FILE_NOTIFY_CHANGE_SIZE | + FILE_NOTIFY_CHANGE_LAST_WRITE | + FILE_NOTIFY_CHANGE_CREATION; + + ResetEvent(watch->hEvent); + + memset(&watch->overlapped, 0, sizeof(watch->overlapped)); + watch->overlapped.hEvent = watch->hEvent; + +start_watch: + /* + * Queue an async call using Overlapped IO. This returns immediately. + * Our event handle will be signalled when the real result is available. + * + * The return value here just means that we successfully queued it. + * We won't know if the Read...() actually produces data until later. + */ + watch->is_active = ReadDirectoryChangesW( + watch->hDir, watch->buffer, watch->buf_len, TRUE, + dwNotifyFilter, &watch->count, &watch->overlapped, NULL); + + /* + * The kernel throws an invalid parameter error when our buffer + * is too big and we are pointed at a remote directory (and possibly + * for other reasons). Quietly set it down and try again. + * + * See note about MAX_RDCW_BUF at the top. + */ + if (!watch->is_active && + GetLastError() == ERROR_INVALID_PARAMETER && + watch->buf_len > MAX_RDCW_BUF_FALLBACK) { + watch->buf_len = MAX_RDCW_BUF_FALLBACK; + goto start_watch; + } + + if (watch->is_active) + return 0; + + error("ReadDirectoryChangedW failed on '%s' [GLE %ld]", + watch->path.buf, GetLastError()); + return -1; +} + +static int recv_rdcw_watch(struct one_watch *watch) +{ + watch->is_active = FALSE; + + /* + * The overlapped result is ready. If the Read...() was successful + * we finally receive the actual result into our buffer. + */ + if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count, + TRUE)) + return 0; + + /* + * NEEDSWORK: If an external <gitdir> is deleted, the above + * returns an error. I'm not sure that there's anything that + * we can do here other than failing -- the <worktree>/.git + * link file would be broken anyway. We might try to check + * for that and return a better error message, but I'm not + * sure it is worth it. + */ + + error("GetOverlappedResult failed on '%s' [GLE %ld]", + watch->path.buf, GetLastError()); + return -1; +} + +static void cancel_rdcw_watch(struct one_watch *watch) +{ + DWORD count; + + if (!watch || !watch->is_active) + return; + + /* + * The calls to ReadDirectoryChangesW() and GetOverlappedResult() + * form a "pair" (my term) where we queue an IO and promise to + * hang around and wait for the kernel to give us the result. + * + * If for some reason after we queue the IO, we have to quit + * or otherwise not stick around for the second half, we must + * tell the kernel to abort the IO. This prevents the kernel + * from writing to our buffer and/or signalling our event + * after we free them. + * + * (Ask me how much fun it was to track that one down). + */ + CancelIoEx(watch->hDir, &watch->overlapped); + GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE); + watch->is_active = FALSE; +} + +/* + * Process filesystem events that happen anywhere (recursively) under the + * <worktree> root directory. For a normal working directory, this includes + * both version controlled files and the contents of the .git/ directory. + * + * If <worktree>/.git is a file, then we only see events for the file + * itself. + */ +static int process_worktree_events(struct fsmonitor_daemon_state *state) +{ + struct fsmonitor_daemon_backend_data *data = state->backend_data; + struct one_watch *watch = data->watch_worktree; + struct strbuf path = STRBUF_INIT; + struct string_list cookie_list = STRING_LIST_INIT_DUP; + struct fsmonitor_batch *batch = NULL; + const char *p = watch->buffer; + + /* + * If the kernel gets more events than will fit in the kernel + * buffer associated with our RDCW handle, it drops them and + * returns a count of zero. + * + * Yes, the call returns WITHOUT error and with length zero. + * + * (The "overflow" case is not ambiguous with the "no data" case + * because we did an INFINITE wait.) + * + * This means we have a gap in coverage. Tell the daemon layer + * to resync. + */ + if (!watch->count) { + trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel", + "overflow"); + fsmonitor_force_resync(state); + return LISTENER_HAVE_DATA_WORKTREE; + } + + /* + * On Windows, `info` contains an "array" of paths that are + * relative to the root of whichever directory handle received + * the event. + */ + for (;;) { + FILE_NOTIFY_INFORMATION *info = (void *)p; + const char *slash; + enum fsmonitor_path_type t; + + strbuf_reset(&path); + if (normalize_path_in_utf8(info, &path) == -1) + goto skip_this_path; + + t = fsmonitor_classify_path_workdir_relative(path.buf); + + switch (t) { + case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX: + /* special case cookie files within .git */ + + /* Use just the filename of the cookie file. */ + slash = find_last_dir_sep(path.buf); + string_list_append(&cookie_list, + slash ? slash + 1 : path.buf); + break; + + case IS_INSIDE_DOT_GIT: + /* ignore everything inside of "<worktree>/.git/" */ + break; + + case IS_DOT_GIT: + /* "<worktree>/.git" was deleted (or renamed away) */ + if ((info->Action == FILE_ACTION_REMOVED) || + (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) { + trace2_data_string("fsmonitor", NULL, + "fsm-listen/dotgit", + "removed"); + goto force_shutdown; + } + break; + + case IS_WORKDIR_PATH: + /* queue normal pathname */ + if (!batch) + batch = fsmonitor_batch__new(); + fsmonitor_batch__add_path(batch, path.buf); + break; + + case IS_GITDIR: + case IS_INSIDE_GITDIR: + case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX: + default: + BUG("unexpected path classification '%d' for '%s'", + t, path.buf); + } + +skip_this_path: + if (!info->NextEntryOffset) + break; + p += info->NextEntryOffset; + } + + fsmonitor_publish(state, batch, &cookie_list); + batch = NULL; + string_list_clear(&cookie_list, 0); + strbuf_release(&path); + return LISTENER_HAVE_DATA_WORKTREE; + +force_shutdown: + fsmonitor_batch__pop(batch); + string_list_clear(&cookie_list, 0); + strbuf_release(&path); + return LISTENER_SHUTDOWN; +} + +/* + * Process filesystem events that happened anywhere (recursively) under the + * external <gitdir> (such as non-primary worktrees or submodules). + * We only care about cookie files that our client threads created here. + * + * Note that we DO NOT get filesystem events on the external <gitdir> + * itself (it is not inside something that we are watching). In particular, + * we do not get an event if the external <gitdir> is deleted. + */ +static int process_gitdir_events(struct fsmonitor_daemon_state *state) +{ + struct fsmonitor_daemon_backend_data *data = state->backend_data; + struct one_watch *watch = data->watch_gitdir; + struct strbuf path = STRBUF_INIT; + struct string_list cookie_list = STRING_LIST_INIT_DUP; + const char *p = watch->buffer; + + if (!watch->count) { + trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel", + "overflow"); + fsmonitor_force_resync(state); + return LISTENER_HAVE_DATA_GITDIR; + } + + for (;;) { + FILE_NOTIFY_INFORMATION *info = (void *)p; + const char *slash; + enum fsmonitor_path_type t; + + strbuf_reset(&path); + if (normalize_path_in_utf8(info, &path) == -1) + goto skip_this_path; + + t = fsmonitor_classify_path_gitdir_relative(path.buf); + + switch (t) { + case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX: + /* special case cookie files within gitdir */ + + /* Use just the filename of the cookie file. */ + slash = find_last_dir_sep(path.buf); + string_list_append(&cookie_list, + slash ? slash + 1 : path.buf); + break; + + case IS_INSIDE_GITDIR: + goto skip_this_path; + + default: + BUG("unexpected path classification '%d' for '%s'", + t, path.buf); + } + +skip_this_path: + if (!info->NextEntryOffset) + break; + p += info->NextEntryOffset; + } + + fsmonitor_publish(state, NULL, &cookie_list); + string_list_clear(&cookie_list, 0); + strbuf_release(&path); + return LISTENER_HAVE_DATA_GITDIR; } void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state) { + struct fsmonitor_daemon_backend_data *data = state->backend_data; + DWORD dwWait; + + state->error_code = 0; + + if (start_rdcw_watch(data, data->watch_worktree) == -1) + goto force_error_stop; + + if (data->watch_gitdir && + start_rdcw_watch(data, data->watch_gitdir) == -1) + goto force_error_stop; + + for (;;) { + dwWait = WaitForMultipleObjects(data->nr_listener_handles, + data->hListener, + FALSE, INFINITE); + + if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) { + if (recv_rdcw_watch(data->watch_worktree) == -1) + goto force_error_stop; + if (process_worktree_events(state) == LISTENER_SHUTDOWN) + goto force_shutdown; + if (start_rdcw_watch(data, data->watch_worktree) == -1) + goto force_error_stop; + continue; + } + + if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) { + if (recv_rdcw_watch(data->watch_gitdir) == -1) + goto force_error_stop; + if (process_gitdir_events(state) == LISTENER_SHUTDOWN) + goto force_shutdown; + if (start_rdcw_watch(data, data->watch_gitdir) == -1) + goto force_error_stop; + continue; + } + + if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN) + goto clean_shutdown; + + error(_("could not read directory changes [GLE %ld]"), + GetLastError()); + goto force_error_stop; + } + +force_error_stop: + state->error_code = -1; + +force_shutdown: + /* + * Tell the IPC thead pool to stop (which completes the await + * in the main thread (which will also signal this thread (if + * we are still alive))). + */ + ipc_server_stop_async(state->ipc_server_data); + +clean_shutdown: + cancel_rdcw_watch(data->watch_worktree); + cancel_rdcw_watch(data->watch_gitdir); } int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state) { + struct fsmonitor_daemon_backend_data *data; + + CALLOC_ARRAY(data, 1); + + data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL); + + data->watch_worktree = create_watch(state, + state->path_worktree_watch.buf); + if (!data->watch_worktree) + goto failed; + + if (state->nr_paths_watching > 1) { + data->watch_gitdir = create_watch(state, + state->path_gitdir_watch.buf); + if (!data->watch_gitdir) + goto failed; + } + + data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown; + data->nr_listener_handles++; + + data->hListener[LISTENER_HAVE_DATA_WORKTREE] = + data->watch_worktree->hEvent; + data->nr_listener_handles++; + + if (data->watch_gitdir) { + data->hListener[LISTENER_HAVE_DATA_GITDIR] = + data->watch_gitdir->hEvent; + data->nr_listener_handles++; + } + + state->backend_data = data; + return 0; + +failed: + CloseHandle(data->hEventShutdown); + destroy_watch(data->watch_worktree); + destroy_watch(data->watch_gitdir); + return -1; } void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state) { + struct fsmonitor_daemon_backend_data *data; + + if (!state || !state->backend_data) + return; + + data = state->backend_data; + + CloseHandle(data->hEventShutdown); + destroy_watch(data->watch_worktree); + destroy_watch(data->watch_gitdir); + + FREE_AND_NULL(state->backend_data); }