diff mbox series

[17/23] fsmonitor--daemon: periodically truncate list of modified files

Message ID 9f263e70c7245a01210ac743ce3dfda4dcc08308.1617291666.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Builtin FSMonitor Feature | expand

Commit Message

Jeff Hostetler April 1, 2021, 3:40 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.

Clients will ask for the set of changes relative to a token that they
found in the FSMN index extension in the index.  (This token is like a
point in time, but different).  Clients will then update the index to
contain the response token (so that subsequent commands will be
relative to this new token).

Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older that the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to the talk to the daemon,
we lazily truncate the list.

We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 78 +++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

Comments

Derrick Stolee April 27, 2021, 1:24 p.m. UTC | #1
On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to periodically truncate the list of
> modified files to save some memory.
> 
> Clients will ask for the set of changes relative to a token that they
> found in the FSMN index extension in the index.  (This token is like a
> point in time, but different).  Clients will then update the index to
> contain the response token (so that subsequent commands will be
> relative to this new token).
> 
> Therefore, the daemon can gradually truncate the in-memory list of
> changed paths as they become obsolete (older that the previous token).

s/older that/older than/

> Since we may have multiple clients making concurrent requests with a
> skew of tokens and clients may be racing to the talk to the daemon,
> we lazily truncate the list.
> 
> We introduce a 5 minute delay and truncate batches 5 minutes after
> they are considered obsolete.

5 minutes seems like a good default timeframe. We can consider
making this customizable in the future.

> +/*
> + * To keep the batch list from growing unbounded in response to filesystem
> + * activity, we try to truncate old batches from the end of the list as
> + * they become irrelevant.
> + *
> + * We assume that the .git/index will be updated with the most recent token
> + * any time the index is updated.  And future commands will only ask for
> + * recent changes *since* that new token.  So as tokens advance into the
> + * future, older batch items will never be requested/needed.  So we can
> + * truncate them without loss of functionality.
> + *
> + * However, multiple commands may be talking to the daemon concurrently
> + * or perform a slow command, so a little "token skew" is possible.
> + * Therefore, we want this to be a little bit lazy and have a generous
> + * delay.

I appreciate this documentation of the "expected" behavior and how it
compares to the "possible" behavior.

> + * The current reader thread walked backwards in time from `token->batch_head`
> + * back to `batch_marker` somewhere in the middle of the batch list.
> + *
> + * Let's walk backwards in time from that marker an arbitrary delay
> + * and truncate the list there.  Note that these timestamps are completely
> + * artificial (based on when we pinned the batch item) and not on any
> + * filesystem activity.
> + */
> +#define MY_TIME_DELAY (5 * 60) /* seconds */

Perhaps put the units into the macro? MY_TIME_DELAY_SECONDS?
> +static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
> +				      const struct fsmonitor_batch *batch_marker)
> +{
> +	/* assert state->main_lock */

If this comment is intended to be a warning for consumers that they should
have the lock around this method, then maybe that should be in a documentation
comment above the method declaration.

> +	const struct fsmonitor_batch *batch;
> +	struct fsmonitor_batch *rest;
> +	struct fsmonitor_batch *p;
> +	time_t t;

This is only used within the for loop, so it could be defined there.

> +
> +	if (!batch_marker)
> +		return;
> +
> +	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",

What's the value of abbreviating "truncate" like this? Is there a special
reason?

> +			 batch_marker->batch_seq_nr,
> +			 (uint64_t)batch_marker->pinned_time);
> +
> +	for (batch = batch_marker; batch; batch = batch->next) {
> +		if (!batch->pinned_time) /* an overflow batch */
> +			continue;
> +
> +		t = batch->pinned_time + MY_TIME_DELAY;
> +		if (t > batch_marker->pinned_time) /* too close to marker */
> +			continue;> +
> +		goto truncate_past_here;
> +	}
> +
> +	return;
> +
> +truncate_past_here:
> +	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
> +
> +	rest = ((struct fsmonitor_batch *)batch)->next;
> +	((struct fsmonitor_batch *)batch)->next = NULL;
> +
> +	for (p = rest; p; p = fsmonitor_batch__free(p)) {
> +		trace_printf_key(&trace_fsmonitor,
> +				 "TRNC kill (%"PRIu64",%"PRIu64")",
> +				 p->batch_seq_nr, (uint64_t)p->pinned_time);
> +	}

I see that you are not using the method that frees the entire list so
you can trace each entry as it is deleted. That works.

> +}
> +
>  static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
>  {
>  	struct fsmonitor_batch *p;
> @@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  			 * that work.
>  			 */
>  			fsmonitor_free_token_data(token_data);
> +		} else if (batch) {
> +			/*
> +			 * This batch is the first item in the list
> +			 * that is older than the requested sequence
> +			 * number and might be considered to be
> +			 * obsolete.  See if we can truncate the list
> +			 * and save some memory.
> +			 */
> +			fsmonitor_batch__truncate(state, batch);

Seems to work as advertised.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 32df392b25d3..e9a9aea59ad6 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -316,6 +316,75 @@  static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
 			batch_src->interned_paths[k];
 }
 
+/*
+ * To keep the batch list from growing unbounded in response to filesystem
+ * activity, we try to truncate old batches from the end of the list as
+ * they become irrelevant.
+ *
+ * We assume that the .git/index will be updated with the most recent token
+ * any time the index is updated.  And future commands will only ask for
+ * recent changes *since* that new token.  So as tokens advance into the
+ * future, older batch items will never be requested/needed.  So we can
+ * truncate them without loss of functionality.
+ *
+ * However, multiple commands may be talking to the daemon concurrently
+ * or perform a slow command, so a little "token skew" is possible.
+ * Therefore, we want this to be a little bit lazy and have a generous
+ * delay.
+ *
+ * The current reader thread walked backwards in time from `token->batch_head`
+ * back to `batch_marker` somewhere in the middle of the batch list.
+ *
+ * Let's walk backwards in time from that marker an arbitrary delay
+ * and truncate the list there.  Note that these timestamps are completely
+ * artificial (based on when we pinned the batch item) and not on any
+ * filesystem activity.
+ */
+#define MY_TIME_DELAY (5 * 60) /* seconds */
+
+static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
+				      const struct fsmonitor_batch *batch_marker)
+{
+	/* assert state->main_lock */
+
+	const struct fsmonitor_batch *batch;
+	struct fsmonitor_batch *rest;
+	struct fsmonitor_batch *p;
+	time_t t;
+
+	if (!batch_marker)
+		return;
+
+	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",
+			 batch_marker->batch_seq_nr,
+			 (uint64_t)batch_marker->pinned_time);
+
+	for (batch = batch_marker; batch; batch = batch->next) {
+		if (!batch->pinned_time) /* an overflow batch */
+			continue;
+
+		t = batch->pinned_time + MY_TIME_DELAY;
+		if (t > batch_marker->pinned_time) /* too close to marker */
+			continue;
+
+		goto truncate_past_here;
+	}
+
+	return;
+
+truncate_past_here:
+	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
+
+	rest = ((struct fsmonitor_batch *)batch)->next;
+	((struct fsmonitor_batch *)batch)->next = NULL;
+
+	for (p = rest; p; p = fsmonitor_batch__free(p)) {
+		trace_printf_key(&trace_fsmonitor,
+				 "TRNC kill (%"PRIu64",%"PRIu64")",
+				 p->batch_seq_nr, (uint64_t)p->pinned_time);
+	}
+}
+
 static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
 {
 	struct fsmonitor_batch *p;
@@ -647,6 +716,15 @@  static int do_handle_client(struct fsmonitor_daemon_state *state,
 			 * that work.
 			 */
 			fsmonitor_free_token_data(token_data);
+		} else if (batch) {
+			/*
+			 * This batch is the first item in the list
+			 * that is older than the requested sequence
+			 * number and might be considered to be
+			 * obsolete.  See if we can truncate the list
+			 * and save some memory.
+			 */
+			fsmonitor_batch__truncate(state, batch);
 		}
 	}