diff mbox series

[06/12] fsmonitor: clarify handling of directory events in callback

Message ID 5b6f8bd1fe7b6c742b25a5a1ed95b528f352215e.1707857541.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series FSMonitor edge cases on case-insensitive file systems | expand

Commit Message

Jeff Hostetler Feb. 13, 2024, 8:52 p.m. UTC
From: Jeff Hostetler <jeffhostetler@github.com>

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
 fsmonitor.c | 47 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 14 deletions(-)

Comments

Junio C Hamano Feb. 14, 2024, 7:47 a.m. UTC | #1
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhostetler@github.com>
>
> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
> ---
>  fsmonitor.c | 47 +++++++++++++++++++++++++++++++++--------------
>  1 file changed, 33 insertions(+), 14 deletions(-)
>
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 614270fa5e8..754fe20cfd0 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -219,24 +219,40 @@ static void fsmonitor_refresh_callback_unqualified(
>  	}
>  }
>  
> -static void fsmonitor_refresh_callback_slash(
> +/*
> + * The daemon can decorate directory events, such as a move or rename,
> + * by adding a trailing slash to the observed name.  Use this to
> + * explicitly invalidate the entire cone under that directory.
> + *
> + * The daemon can only reliably do that if the OS FSEvent contains
> + * sufficient information in the event.
> + *
> + * macOS FSEvents have enough information.
> + *
> + * Other platforms may or may not be able to do it (and it might
> + * depend on the type of event (for example, a daemon could lstat() an
> + * observed pathname after a rename, but not after a delete)).
> + *
> + * If we find an exact match in the index for a path with a trailing
> + * slash, it means that we matched a sparse-index directory in a
> + * cone-mode sparse-checkout (since that's the only time we have
> + * directories in the index).  We should never see this in practice
> + * (because sparse directories should not be present and therefore
> + * not generating FS events).  Either way, we can treat them in the
> + * same way and just invalidate the cache-entry and the untracked
> + * cache (and in this case, the forward cache-entry scan won't find
> + * anything and it doesn't hurt to let it run).
> + *
> + * Return the number of cache-entries that we invalidated.  We will
> + * use this later to determine if we need to attempt a second
> + * case-insensitive search.
> + */
> +static int fsmonitor_refresh_callback_slash(
>  	struct index_state *istate, const char *name, int len, int pos)
>  {

This was split out a few patches ago, and the caller of course
ignored the return value (void), but now it turns an integer, and
this change is without a corresponding update to the caller, which
leaves readers puzzled.

Perhaps a future patch either updates the existing caller or adds a
new caller that utilize the returned value, but then at least the
proposed commit message for this step should hint how it helps the
caller(s) we will see in the future steps if this function returns
the number of entries invalidated, iow, how the caller is expected
to use the returned value from here, no?

Alternatively, this step can limit itself to what the commit title
claims to do---to clarify what the helper does with enhanced in-code
comments.  Then a future step that updates the caller to care about
the return value can have both the changes to this callee as well as
the caller---which may make it easier to see how the returned info
helps the caller.  I dunno which is more reasonable.

Thanks.



>  	int i;
> +	int nr_in_cone = 0;
>  
> -	/*
> -	 * The daemon can decorate directory events, such as
> -	 * moves or renames, with a trailing slash if the OS
> -	 * FS Event contains sufficient information, such as
> -	 * MacOS.
> -	 *
> -	 * Use this to invalidate the entire cone under that
> -	 * directory.
> -	 *
> -	 * We do not expect an exact match because the index
> -	 * does not normally contain directory entries, so we
> -	 * start at the insertion point and scan.
> -	 */
>  	if (pos < 0)
>  		pos = -pos - 1;
>  
> @@ -245,7 +261,10 @@ static void fsmonitor_refresh_callback_slash(
>  		if (!starts_with(istate->cache[i]->name, name))
>  			break;
>  		istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
> +		nr_in_cone++;
>  	}
> +
> +	return nr_in_cone;
>  }
>  
>  static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
Patrick Steinhardt Feb. 15, 2024, 9:32 a.m. UTC | #2
On Tue, Feb 13, 2024 at 08:52:15PM +0000, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhostetler@github.com>
> 
> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
> ---
>  fsmonitor.c | 47 +++++++++++++++++++++++++++++++++--------------
>  1 file changed, 33 insertions(+), 14 deletions(-)
> 
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 614270fa5e8..754fe20cfd0 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -219,24 +219,40 @@ static void fsmonitor_refresh_callback_unqualified(
>  	}
>  }
>  
> -static void fsmonitor_refresh_callback_slash(
> +/*
> + * The daemon can decorate directory events, such as a move or rename,
> + * by adding a trailing slash to the observed name.  Use this to
> + * explicitly invalidate the entire cone under that directory.
> + *
> + * The daemon can only reliably do that if the OS FSEvent contains
> + * sufficient information in the event.
> + *
> + * macOS FSEvents have enough information.
> + *
> + * Other platforms may or may not be able to do it (and it might
> + * depend on the type of event (for example, a daemon could lstat() an
> + * observed pathname after a rename, but not after a delete)).
> + *
> + * If we find an exact match in the index for a path with a trailing
> + * slash, it means that we matched a sparse-index directory in a
> + * cone-mode sparse-checkout (since that's the only time we have
> + * directories in the index).  We should never see this in practice
> + * (because sparse directories should not be present and therefore
> + * not generating FS events).  Either way, we can treat them in the
> + * same way and just invalidate the cache-entry and the untracked
> + * cache (and in this case, the forward cache-entry scan won't find
> + * anything and it doesn't hurt to let it run).
> + *
> + * Return the number of cache-entries that we invalidated.  We will
> + * use this later to determine if we need to attempt a second
> + * case-insensitive search.
> + */
> +static int fsmonitor_refresh_callback_slash(
>  	struct index_state *istate, const char *name, int len, int pos)
>  {
>  	int i;
> +	int nr_in_cone = 0;

Should we return `size_t` instead of `int`?

Patrick

> -	/*
> -	 * The daemon can decorate directory events, such as
> -	 * moves or renames, with a trailing slash if the OS
> -	 * FS Event contains sufficient information, such as
> -	 * MacOS.
> -	 *
> -	 * Use this to invalidate the entire cone under that
> -	 * directory.
> -	 *
> -	 * We do not expect an exact match because the index
> -	 * does not normally contain directory entries, so we
> -	 * start at the insertion point and scan.
> -	 */
>  	if (pos < 0)
>  		pos = -pos - 1;
>  
> @@ -245,7 +261,10 @@ static void fsmonitor_refresh_callback_slash(
>  		if (!starts_with(istate->cache[i]->name, name))
>  			break;
>  		istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
> +		nr_in_cone++;
>  	}
> +
> +	return nr_in_cone;
>  }
>  
>  static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
> -- 
> gitgitgadget
> 
>
Jeff Hostetler Feb. 20, 2024, 6:56 p.m. UTC | #3
On 2/14/24 2:47 AM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Jeff Hostetler <jeffhostetler@github.com>
>>
>> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
>> ---
>>   fsmonitor.c | 47 +++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 33 insertions(+), 14 deletions(-)
>>
>> diff --git a/fsmonitor.c b/fsmonitor.c
>> index 614270fa5e8..754fe20cfd0 100644
>> --- a/fsmonitor.c
>> +++ b/fsmonitor.c
>> @@ -219,24 +219,40 @@ static void fsmonitor_refresh_callback_unqualified(
>>   	}
>>   }
>>   
>> -static void fsmonitor_refresh_callback_slash(
...
>> +static int fsmonitor_refresh_callback_slash(
>>   	struct index_state *istate, const char *name, int len, int pos)
>>   {
> 
> This was split out a few patches ago, and the caller of course
> ignored the return value (void), but now it turns an integer, and
> this change is without a corresponding update to the caller, which
> leaves readers puzzled.
> 
> Perhaps a future patch either updates the existing caller or adds a
> new caller that utilize the returned value, but then at least the
> proposed commit message for this step should hint how it helps the
> caller(s) we will see in the future steps if this function returns
> the number of entries invalidated, iow, how the caller is expected
> to use the returned value from here, no?
> 
> Alternatively, this step can limit itself to what the commit title
> claims to do---to clarify what the helper does with enhanced in-code
> comments.  Then a future step that updates the caller to care about
> the return value can have both the changes to this callee as well as
> the caller---which may make it easier to see how the returned info
> helps the caller.  I dunno which is more reasonable.
> 

I'll split this into 2 commits.  One for the refactor and one for
the new return value.

Jeff
Jeff Hostetler Feb. 20, 2024, 7:10 p.m. UTC | #4
On 2/15/24 4:32 AM, Patrick Steinhardt wrote:
> On Tue, Feb 13, 2024 at 08:52:15PM +0000, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhostetler@github.com>
>>
>> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
>> ---
>>   fsmonitor.c | 47 +++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 33 insertions(+), 14 deletions(-)
>>
>> diff --git a/fsmonitor.c b/fsmonitor.c
>> index 614270fa5e8..754fe20cfd0 100644
>> --- a/fsmonitor.c
>> +++ b/fsmonitor.c
>> @@ -219,24 +219,40 @@ static void fsmonitor_refresh_callback_unqualified(
...
>> +static int fsmonitor_refresh_callback_slash(
>>   	struct index_state *istate, const char *name, int len, int pos)
>>   {
>>   	int i;
>> +	int nr_in_cone = 0;
> 
> Should we return `size_t` instead of `int`?
> 
> Patrick

yeah, I can fix all of the return values to be 'size_t' since
that is new functionality and not colliding with the existing
usages for 'i' and 'pos' that I mentioned in a response on a
previous thread.

Thanks
Jeff
Junio C Hamano Feb. 20, 2024, 7:24 p.m. UTC | #5
Jeff Hostetler <git@jeffhostetler.com> writes:

> I'll split this into 2 commits.  One for the refactor and one for
> the new return value.

And the latter one that makes the return value richer contains the
caller that makes use of the returned value?  That's great.  It
would make it very much easier to read the resulting commit, as the
presence of the callers and how they use the returned value would
make it self evident why it makes sense to return the number of
entries invalidated.

Thanks.
diff mbox series

Patch

diff --git a/fsmonitor.c b/fsmonitor.c
index 614270fa5e8..754fe20cfd0 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -219,24 +219,40 @@  static void fsmonitor_refresh_callback_unqualified(
 	}
 }
 
-static void fsmonitor_refresh_callback_slash(
+/*
+ * The daemon can decorate directory events, such as a move or rename,
+ * by adding a trailing slash to the observed name.  Use this to
+ * explicitly invalidate the entire cone under that directory.
+ *
+ * The daemon can only reliably do that if the OS FSEvent contains
+ * sufficient information in the event.
+ *
+ * macOS FSEvents have enough information.
+ *
+ * Other platforms may or may not be able to do it (and it might
+ * depend on the type of event (for example, a daemon could lstat() an
+ * observed pathname after a rename, but not after a delete)).
+ *
+ * If we find an exact match in the index for a path with a trailing
+ * slash, it means that we matched a sparse-index directory in a
+ * cone-mode sparse-checkout (since that's the only time we have
+ * directories in the index).  We should never see this in practice
+ * (because sparse directories should not be present and therefore
+ * not generating FS events).  Either way, we can treat them in the
+ * same way and just invalidate the cache-entry and the untracked
+ * cache (and in this case, the forward cache-entry scan won't find
+ * anything and it doesn't hurt to let it run).
+ *
+ * Return the number of cache-entries that we invalidated.  We will
+ * use this later to determine if we need to attempt a second
+ * case-insensitive search.
+ */
+static int fsmonitor_refresh_callback_slash(
 	struct index_state *istate, const char *name, int len, int pos)
 {
 	int i;
+	int nr_in_cone = 0;
 
-	/*
-	 * The daemon can decorate directory events, such as
-	 * moves or renames, with a trailing slash if the OS
-	 * FS Event contains sufficient information, such as
-	 * MacOS.
-	 *
-	 * Use this to invalidate the entire cone under that
-	 * directory.
-	 *
-	 * We do not expect an exact match because the index
-	 * does not normally contain directory entries, so we
-	 * start at the insertion point and scan.
-	 */
 	if (pos < 0)
 		pos = -pos - 1;
 
@@ -245,7 +261,10 @@  static void fsmonitor_refresh_callback_slash(
 		if (!starts_with(istate->cache[i]->name, name))
 			break;
 		istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+		nr_in_cone++;
 	}
+
+	return nr_in_cone;
 }
 
 static void fsmonitor_refresh_callback(struct index_state *istate, char *name)