Message ID | pull.1143.v2.git.1646777727.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Builtin FSMonitor Part 3 | expand |
On 3/8/2022 5:15 PM, Jeff Hostetler via GitGitGadget wrote: > Here is V2 of part 3 of my builtin FSMonitor series. > > I think I have addressed all of the feedback from V1. This includes: > Range-diff vs v1: > -: ----------- > 22: 524d449ed64 fsmonitor: never set CE_FSMONITOR_VALID on submodules > -: ----------- > 24: 95b9d4210d2 fsmonitor: on macOS also emit NFC spelling for NFD pathname > -: ----------- > 25: 5a0c1b7a287 t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd > -: ----------- > 26: a45c1fd3000 t7527: test Unicode NFC/NFD handling on MacOS > -: ----------- > 27: e3e01677d93 fsmonitor-settings: NTFS and FAT32 on MacOS are incompatible I looked closely through the range-diff for the edits, then looked at these new patches closely. Outside of one thought about some debug output, I'm happy with this version. Thanks, -Stolee
On Tue, Mar 08, 2022 at 10:15:00PM +0000, Jeff Hostetler via GitGitGadget wrote:
> Here is V2 of part 3 of my builtin FSMonitor series.
Hej Jeff,
First of all, the new test case passes here on one machine.
I will do another round on another machine the next days.
Having said that, there are some small questions here and there,
I'll send them as soon as I find the time for a more proper review.
Hej Jeff, I tried your patch on both a newer Mac and an older machine (with HFS+) The older machine doesn't have kFSEventStreamEventFlagItemCloned As it is an enum, and not a #define, I ended up here: diff --git a/compat/fsmonitor/fsm-listen-darwin.c b/compat/fsmonitor/fsm-listen-darwin.c index 3332d3b779..fa172a05c4 100644 --- a/compat/fsmonitor/fsm-listen-darwin.c +++ b/compat/fsmonitor/fsm-listen-darwin.c @@ -169,8 +169,6 @@ static void log_flags_set(const char *path, const FSEventStreamEventFlags flag) strbuf_addstr(&msg, "ItemXattrMod|"); if (flag & kFSEventStreamEventFlagOwnEvent) strbuf_addstr(&msg, "OwnEvent|"); - if (flag & kFSEventStreamEventFlagItemCloned) - strbuf_addstr(&msg, "ItemCloned|"); trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=0x%x %s", path, flag, msg.buf); @@ -221,8 +219,7 @@ static int ef_ignore_xattr(const FSEventStreamEventFlags ef) kFSEventStreamEventFlagItemModified | kFSEventStreamEventFlagItemRemoved | kFSEventStreamEventFlagItemRenamed | - kFSEventStreamEventFlagItemXattrMod | - kFSEventStreamEventFlagItemCloned; + kFSEventStreamEventFlagItemXattrMod ; One other thing, I just add it here: There is a new file, t/lib-unicode-nfc-nfd.sh, which helps us with this code: test_lazy_prereq UNICODE_NFC_PRESERVED The existing code uses a construct called UTF8_NFD_TO_NFC And now I have 2 questions: - Do we need the UNICODE_NFC_PRESERVED at all ? - And should the UTF8_NFD_TO_NFC better be called UTF8_NFC_TO_NFD, because that is what it checks. - Do we need the UNICODE_NFD_PRESERVED at all ? As there are no non-UNICODE_NFD_PRESERVED filesystems, as far as I know. And the current code does no tests, just debug prints. I dunno. On Tue, Mar 08, 2022 at 10:15:00PM +0000, Jeff Hostetler via GitGitGadget wrote: > Here is V2 of part 3 of my builtin FSMonitor series. > [] I updated the daemon on MacOS to report both the NFC and NFD spellings of > a pathname when appropriate. This is a little more general than the > "core.precomposeUnicode" setting, since the daemon does not know how the > client has (or will have) it set when they make a query. > > [] I replaced my Unicode NFC/NFD test for MacOS to focus exclusively on > Unicode composition/decomposition sensitivity and to not confuse that with > case sensitivity. That is a good thing. [snip]
On 3/13/22 6:42 AM, Torsten Bögershausen wrote: > Hej Jeff, > > I tried your patch on both a newer Mac and an older machine (with HFS+) > The older machine doesn't have > kFSEventStreamEventFlagItemCloned > As it is an enum, and not a #define, I ended up here: > > diff --git a/compat/fsmonitor/fsm-listen-darwin.c b/compat/fsmonitor/fsm-listen-darwin.c > index 3332d3b779..fa172a05c4 100644 > --- a/compat/fsmonitor/fsm-listen-darwin.c > +++ b/compat/fsmonitor/fsm-listen-darwin.c > @@ -169,8 +169,6 @@ static void log_flags_set(const char *path, const FSEventStreamEventFlags flag) > strbuf_addstr(&msg, "ItemXattrMod|"); > if (flag & kFSEventStreamEventFlagOwnEvent) > strbuf_addstr(&msg, "OwnEvent|"); > - if (flag & kFSEventStreamEventFlagItemCloned) > - strbuf_addstr(&msg, "ItemCloned|"); > > trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=0x%x %s", > path, flag, msg.buf); > @@ -221,8 +219,7 @@ static int ef_ignore_xattr(const FSEventStreamEventFlags ef) > kFSEventStreamEventFlagItemModified | > kFSEventStreamEventFlagItemRemoved | > kFSEventStreamEventFlagItemRenamed | > - kFSEventStreamEventFlagItemXattrMod | > - kFSEventStreamEventFlagItemCloned; > + kFSEventStreamEventFlagItemXattrMod ; It looks like the ...Cloned bit was added to the SDK in 10.13 [1]. All the other bits were defined sometime between 10.5 and 10.10. I'll add something in V7 to guard that bit. I think 10.10 is old enough that we don't need to special case those bits too. Thanks, Jeff [1] /Applications/Xcode.app/Contents/Developer/Platforms/ \ MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/ \ Library/Frameworks/CoreServices.framework/Frameworks/ \ FSEvents.framework/Versions/Current/Headers/FSEvents.h
On 3/13/22 6:42 AM, Torsten Bögershausen wrote: > Hej Jeff, > [...] > > One other thing, I just add it here: > There is a new file, t/lib-unicode-nfc-nfd.sh, which helps us with this code: > test_lazy_prereq UNICODE_NFC_PRESERVED > > The existing code uses a construct called > UTF8_NFD_TO_NFC > > And now I have 2 questions: > - Do we need the UNICODE_NFC_PRESERVED at all ? > - And should the UTF8_NFD_TO_NFC better be called UTF8_NFC_TO_NFD, > because that is what it checks. > - Do we need the UNICODE_NFD_PRESERVED at all ? > > As there are no non-UNICODE_NFD_PRESERVED filesystems, as far as I know. > And the current code does no tests, just debug prints. > I dunno. I created t/lib-unicode-nfc-nfd.sh to help me understand the issues. I found the existing UTF8_NFD_TO_NFC prereq confusing (and yes it seemed poorly named). The existing prereq returned the same answer on APFS, HFS+, and FAT32 (a thumbdrive). I know they behave differently and I found it odd that the prereq did not make any distinction. I was hesitant to rename the existing prereq because it is currently used by 5+ different tests and I didn't want to expand the scope of my two already very large series. Also, the existing prereq feels a little sloppy. It creates a file in NFC and does a lstat in the NFD spelling. There are several ways that the OS and/or FS can lie to us. For example, the prereq is satisfied on a FAT32 thumbdrive and we know FAT32 doesn't do NFC-->NFD conversions. So I'd like to move away from that prereq definition at some point. My new prereqs try to: (1) independently confirm whether there is aliasing happening at all (whether at the FS or OS layer). (2) determine if the actual on-disk spelling is altered by the FS (in both NFC and NFD cases). We know that HFS+ does not preserve NFC spellings, but APFS does. (FAT32 also preserves NFC spelling under MacOS.) So the UNICODE_NFC_PRESERVED lets me distinguish between HFS+ and APFS/FAT32. I have not heard of any filesystems that convert NFD to NFC, so technically we don't need the UNICODE_NFD_PRESERVED prereq, but then again until I tested that, it was unclear how MacOS did the aliasing on APFS (and FAT32). On the basis of that testing, we can say that MacOS -- at the MacOS layer -- is responsible for the aliasing and that both NFC and NFD spellings are preserved on APFS and FAT32. So I'd rather keep the 3 prereqs that I have now. The ones marked _DOUBLE_ are currently extra. I have them to help study how code points with multiple combining characters are handled. I have prereqs for the basic double chars, but there are several opportunities for weird edge cases (non- canonical ordering and other collisions) that I don't want to get stuck on right now. So we might make more use of them in the future. That's too long of an answer, but hopefully that explains some of my paranoia. :-) Jeff > > On Tue, Mar 08, 2022 at 10:15:00PM +0000, Jeff Hostetler via GitGitGadget wrote: >> Here is V2 of part 3 of my builtin FSMonitor series. [...]
On March 21, 2022 6:06 PM, Jeff Hostetler wrote: >On 3/13/22 6:42 AM, Torsten Bögershausen wrote: >> Hej Jeff, >> >> I tried your patch on both a newer Mac and an older machine (with >> HFS+) The older machine doesn't have kFSEventStreamEventFlagItemCloned >> As it is an enum, and not a #define, I ended up here: >> >> diff --git a/compat/fsmonitor/fsm-listen-darwin.c b/compat/fsmonitor/fsm- >listen-darwin.c >> index 3332d3b779..fa172a05c4 100644 >> --- a/compat/fsmonitor/fsm-listen-darwin.c >> +++ b/compat/fsmonitor/fsm-listen-darwin.c >> @@ -169,8 +169,6 @@ static void log_flags_set(const char *path, const >FSEventStreamEventFlags flag) >> strbuf_addstr(&msg, "ItemXattrMod|"); >> if (flag & kFSEventStreamEventFlagOwnEvent) >> strbuf_addstr(&msg, "OwnEvent|"); >> - if (flag & kFSEventStreamEventFlagItemCloned) >> - strbuf_addstr(&msg, "ItemCloned|"); >> >> trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=0x%x %s", >> path, flag, msg.buf); >> @@ -221,8 +219,7 @@ static int ef_ignore_xattr(const >FSEventStreamEventFlags ef) >> kFSEventStreamEventFlagItemModified | >> kFSEventStreamEventFlagItemRemoved | >> kFSEventStreamEventFlagItemRenamed | >> - kFSEventStreamEventFlagItemXattrMod | >> - kFSEventStreamEventFlagItemCloned; >> + kFSEventStreamEventFlagItemXattrMod ; > >It looks like the ...Cloned bit was added to the SDK in 10.13 [1]. >All the other bits were defined sometime between 10.5 and 10.10. > >I'll add something in V7 to guard that bit. I think 10.10 is old enough that we don't >need to special case those bits too. I realize it is a bit late in the game, but would you consider a pre-hook and post-hook that automatically run with fsmonitor kicks off/terminates. I am thinking about use cases where this is integrated into more complex processes and it would be nice to have notifications of what fsmonitor is doing and when. Thanks, Randall
On 3/21/22 7:18 PM, rsbecker@nexbridge.com wrote: > On March 21, 2022 6:06 PM, Jeff Hostetler wrote: >> On 3/13/22 6:42 AM, Torsten Bögershausen wrote: >>> Hej Jeff, >>> [...] >> >> It looks like the ...Cloned bit was added to the SDK in 10.13 [1]. >> All the other bits were defined sometime between 10.5 and 10.10. >> >> I'll add something in V7 to guard that bit. I think 10.10 is old enough that we don't >> need to special case those bits too. > > I realize it is a bit late in the game, but would you consider a pre-hook and post-hook that automatically run with fsmonitor kicks off/terminates. I am thinking about use cases where this is integrated into more complex processes and it would be nice to have notifications of what fsmonitor is doing and when. > > Thanks, > Randall > I hadn't really considered having a pre/post hook for the daemon. I'm not opposed to it; I just hadn't thought about it. By this I assume you mean something inside the fsmonitor--daemon process that invokes the hooks when it is starting/stopping. As opposed to something in a client command (like status) before it implicitly started a daemon process. The latter method would not give you post-hook events because the daemon usually outlives the client command. Perhaps you could elaborate on what you would use these hooks for or how they would be helpful. It would be easy to add pre/post hooks in the main thread of the daemon. However, I worry about the prehook slowing the startup of the daemon -- since the client status command might be waiting for it to become ready. I also have a "health" thread in part3 that would be a candidate for pre/post and any other periodic hooks that might be useful. But again, before I suggest a design for this, it would be good to know what kind of things you would want to do with them. Jeff
On March 22, 2022 10:11 AM, Jeff Hostetler wrote: >On 3/21/22 7:18 PM, rsbecker@nexbridge.com wrote: >> On March 21, 2022 6:06 PM, Jeff Hostetler wrote: >>> On 3/13/22 6:42 AM, Torsten Bögershausen wrote: >>>> Hej Jeff, >>>> >[...] >>> >>> It looks like the ...Cloned bit was added to the SDK in 10.13 [1]. >>> All the other bits were defined sometime between 10.5 and 10.10. >>> >>> I'll add something in V7 to guard that bit. I think 10.10 is old >>> enough that we don't need to special case those bits too. >> >> I realize it is a bit late in the game, but would you consider a pre-hook and post- >hook that automatically run with fsmonitor kicks off/terminates. I am thinking >about use cases where this is integrated into more complex processes and it >would be nice to have notifications of what fsmonitor is doing and when. >> >> Thanks, >> Randall >> > >I hadn't really considered having a pre/post hook for the daemon. >I'm not opposed to it; I just hadn't thought about it. > >By this I assume you mean something inside the fsmonitor--daemon process that >invokes the hooks when it is starting/stopping. >As opposed to something in a client command (like status) before it implicitly >started a daemon process. The latter method would not give you post-hook >events because the daemon usually outlives the client command. > >Perhaps you could elaborate on what you would use these hooks for or how they >would be helpful. It would be easy to add pre/post hooks in the main thread of >the daemon. However, I worry about the prehook slowing the startup of the >daemon -- since the client status command might be waiting for it to become >ready. I also have a "health" thread in part3 that would be a candidate for >pre/post and any other periodic hooks that might be useful. >But again, before I suggest a design for this, it would be good to know what kind of >things you would want to do with them. Some examples of what I have in mind. There are more, but this covers what I have in mind urgently: 1. Setting up a lock file (semaphore) just before fsmonitor runs that will cause any scripts that might change the state of the repository on the fly to suspend until fsmonitor is done. 2. Ensuring that in-flight scripts that do stuff are finished or not leaving the repo in a transitional state before fsmonitor runs - holding fsmonitor until the pre-hook finishes. 3. Notifying syslog or some other paging system if something has gone horribly wrong - as in fsmonitor found something bad in the index. 4. Clearing any semaphores created earlier (example 1). Regards, Randall
On 3/22/22 10:25 AM, rsbecker@nexbridge.com wrote: > On March 22, 2022 10:11 AM, Jeff Hostetler wrote: >> On 3/21/22 7:18 PM, rsbecker@nexbridge.com wrote: >>> On March 21, 2022 6:06 PM, Jeff Hostetler wrote: >>>> On 3/13/22 6:42 AM, Torsten Bögershausen wrote: >>>>> Hej Jeff, >>>>> >> [...] >>>> >>>> It looks like the ...Cloned bit was added to the SDK in 10.13 [1]. >>>> All the other bits were defined sometime between 10.5 and 10.10. >>>> >>>> I'll add something in V7 to guard that bit. I think 10.10 is old >>>> enough that we don't need to special case those bits too. >>> >>> I realize it is a bit late in the game, but would you consider a pre-hook and post- >> hook that automatically run with fsmonitor kicks off/terminates. I am thinking >> about use cases where this is integrated into more complex processes and it >> would be nice to have notifications of what fsmonitor is doing and when. >>> >>> Thanks, >>> Randall >>> >> >> I hadn't really considered having a pre/post hook for the daemon. >> I'm not opposed to it; I just hadn't thought about it. >> >> By this I assume you mean something inside the fsmonitor--daemon process that >> invokes the hooks when it is starting/stopping. >> As opposed to something in a client command (like status) before it implicitly >> started a daemon process. The latter method would not give you post-hook >> events because the daemon usually outlives the client command. >> >> Perhaps you could elaborate on what you would use these hooks for or how they >> would be helpful. It would be easy to add pre/post hooks in the main thread of >> the daemon. However, I worry about the prehook slowing the startup of the >> daemon -- since the client status command might be waiting for it to become >> ready. I also have a "health" thread in part3 that would be a candidate for >> pre/post and any other periodic hooks that might be useful. >> But again, before I suggest a design for this, it would be good to know what kind of >> things you would want to do with them. > > Some examples of what I have in mind. There are more, but this covers what I have in mind urgently: > > 1. Setting up a lock file (semaphore) just before fsmonitor runs that will cause any scripts that might change the state of the repository on the fly to suspend until fsmonitor is done. The builtin fsmonitor--daemon is a long-running process. It is either explicitly started by the user or implicitly started by clients commands, like "git status", lazily after it is enabled in the config. It runs until explicitly stopped (or the workdir is deleted). It is designed to be running and watch the filesystem for changes. Later client commands can ask it for what has changed on disk since the previous request (checkpoint). And the only way to capture that info is to watch the file system as things happen. (Unless we have a really deep journal, but that often requires admin access, so we don't use that.) The fsmonitor daemon is unlike other subordinate commands in Git. For example, "git fetch" might synchronously invoke "git index-pack" and communicate over the child's stdin/stdout. And that child is bound to a single parent process. When the daemon starts, it disassociates from the console and opens a socket or named pipe and listens for requests (REST-like) commands. It is designed to respond to multiple clients concurrently and over a long time period -- like a daemon or service process. > 2. Ensuring that in-flight scripts that do stuff are finished or not leaving the repo in a transitional state before fsmonitor runs - holding fsmonitor until the pre-hook finishes. > 3. Notifying syslog or some other paging system if something has gone horribly wrong - as in fsmonitor found something bad in the index. fsmonitor doesn't read the index. it's only watching and summarizing file system events. And can enumerate the list of changed paths between two checkpoints in response to a client request. > 4. Clearing any semaphores created earlier (example 1). The daemon isn't designed to ever "be done" and hence there are no on-disk lock files/semaphores. There might be some opportunities for startup/shutdown to do some cleanup (temp files and the like), but I think it is premature to talk about that right now. Hope this helps, Jeff > > Regards, > Randall >