[2/7] maintenance: add --schedule option and config

Message ID 1783e80b8d3b8361d1d62947a49ba584685dacc4.1599234126.git.gitgitgadget@gmail.com
State Superseded
Series Maintenance III: Background maintenance

Commit Message

Derrick Stolee via GitGitGadget Sept. 4, 2020, 3:42 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

A user may want to run certain maintenance tasks based on frequency, not
conditions given in the repository. For example, the user may want to
perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
update the 'git maintenance run' command to include a
'--schedule=<frequency>' option. The allowed frequencies are 'hourly',
'daily', and 'weekly'. These values are also allowed in a new config
value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.
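
As a rough illustration of the ordering above, the sketch below reuses the enum values and string parsing from the patch and wraps the comparison in a helper. The helper name `should_run` is hypothetical and does not appear in the patch; the patch performs the same comparison inline in maintenance_run_tasks().

```c
#include <strings.h> /* strcasecmp (POSIX) */

/* Larger value means more frequent: hourly > daily > weekly. */
enum schedule_priority {
	SCHEDULE_NONE = 0,
	SCHEDULE_WEEKLY = 1,
	SCHEDULE_DAILY = 2,
	SCHEDULE_HOURLY = 3,
};

/* Map a config or command-line string to a priority; unknown maps to NONE. */
static enum schedule_priority parse_schedule(const char *value)
{
	if (!value)
		return SCHEDULE_NONE;
	if (!strcasecmp(value, "hourly"))
		return SCHEDULE_HOURLY;
	if (!strcasecmp(value, "daily"))
		return SCHEDULE_DAILY;
	if (!strcasecmp(value, "weekly"))
		return SCHEDULE_WEEKLY;
	return SCHEDULE_NONE;
}

/*
 * Hypothetical helper (not in the patch): returns 1 when a task whose
 * configured schedule is `task` should run for --schedule=<requested>.
 * When no --schedule was given (SCHEDULE_NONE), nothing is filtered out.
 */
static int should_run(enum schedule_priority task,
		      enum schedule_priority requested)
{
	if (requested && task < requested)
		return 0;
	return 1;
}
```

For instance, should_run(parse_schedule("hourly"), parse_schedule("daily")) yields 1, while should_run(parse_schedule("weekly"), parse_schedule("daily")) yields 0. A task with no configured schedule is skipped by every scheduled run, since SCHEDULE_NONE compares lower than all three frequencies.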

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
  0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly

This cron schedule will run --scheduled=hourly every hour except at
midnight. This avoids a concurrent run with the --scheduled=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --scheduled=weekly that runs at midnight on
the first day of the week. Since --scheduled=daily also runs the
'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.
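
The claim that every task still runs at its proper frequency can be checked with a small model of the cron table. This is only a sketch under stated assumptions: `cron_slot` and `runs_per_week` are hypothetical names, weekday 0 stands for the first day of the week as in the crontab entries, and the enum values are copied from the patch.

```c
/* Larger value means more frequent: hourly > daily > weekly. */
enum schedule_priority {
	SCHEDULE_NONE = 0,
	SCHEDULE_WEEKLY = 1,
	SCHEDULE_DAILY = 2,
	SCHEDULE_HOURLY = 3,
};

/*
 * Hypothetical model of the cron table: which frequency is requested
 * at minute 0 of the given hour on the given weekday (0 = first day).
 */
static enum schedule_priority cron_slot(int weekday, int hour)
{
	if (hour != 0)
		return SCHEDULE_HOURLY;	/* 0 1-23 * * *   */
	if (weekday != 0)
		return SCHEDULE_DAILY;	/* 0 0    * * 1-6 */
	return SCHEDULE_WEEKLY;		/* 0 0    * * 0   */
}

/*
 * Count how often a task with the given configured schedule runs in one
 * week, using the rule from the patch: a task runs when its schedule is
 * at least as frequent as the requested one.
 */
static int runs_per_week(enum schedule_priority task)
{
	int weekday, hour, runs = 0;

	for (weekday = 0; weekday < 7; weekday++)
		for (hour = 0; hour < 24; hour++)
			if (task >= cron_slot(weekday, hour))
				runs++;
	return runs;
}
```

Under this model an 'hourly' task runs 168 times per week (161 hourly slots, 6 daily slots, and 1 weekly slot), a 'daily' task runs 7 times, and a 'weekly' task runs once, with exactly one cron entry firing in any given hour.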

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/maintenance.txt |  5 +++
 Documentation/git-maintenance.txt    | 13 +++++-
 builtin/gc.c                         | 67 +++++++++++++++++++++++++---
 t/t7900-maintenance.sh               | 40 +++++++++++++++++
 4 files changed, 119 insertions(+), 6 deletions(-)

Comments

Đoàn Trần Công Danh Sept. 8, 2020, 1:07 p.m. UTC | #1
On 2020-09-04 15:42:01+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A user may want to run certain maintenance tasks based on frequency, not
> conditions given in the repository. For example, the user may want to

Hm, sorry but I couldn't decipher "not conditions" here. :|

> perform a 'prefetch' task every hour, or 'gc' task every day. To assist,

I think it's better to say: "To assist those users", at least it's
easier to read for non-native English speakers like me.

> update the 'git maintenance run' command to include a
> '--schedule=<frequency>' option. The allowed frequencies are 'hourly',

So, we have "--schedule=" here, ...

> 'daily', and 'weekly'. These values are also allowed in a new config
> value 'maintenance.<task>.schedule'.
> 
> The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'

and here, ...

> config value for each enabled task to see if the configured frequency is
> at least as frequent as the frequency from the '--schedule' argument. We
> use the following order, for full clarity:
> 
> 	'hourly' > 'daily' > 'weekly'
> 
> Use new 'enum schedule_priority' to track these values numerically.
> 
> The following cron table would run the scheduled tasks with the correct
> frequencies:
> 
>   0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
>   0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
>   0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly

but it's spelt "--scheduled=" here and below; a misspelling, I guess.

Reading the patch, it looks like "--scheduled=" is misspelt.

> This cron schedule will run --scheduled=hourly every hour except at
> midnight. This avoids a concurrent run with the --scheduled=daily that
> runs at midnight every day except the first day of the week. This avoids
> a concurrent run with the --scheduled=weekly that runs at midnight on
> the first day of the week. Since --scheduled=daily also runs the
> 'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
> tasks, we will still see all tasks run with the proper frequencies.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/config/maintenance.txt |  5 +++
>  Documentation/git-maintenance.txt    | 13 +++++-
>  builtin/gc.c                         | 67 +++++++++++++++++++++++++---
>  t/t7900-maintenance.sh               | 40 +++++++++++++++++
>  4 files changed, 119 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
> index 06db758172..70585564fa 100644
> --- a/Documentation/config/maintenance.txt
> +++ b/Documentation/config/maintenance.txt
> @@ -10,6 +10,11 @@ maintenance.<task>.enabled::
>  	`--task` option exists. By default, only `maintenance.gc.enabled`
>  	is true.
>  
> +maintenance.<task>.schedule::
> +	This config option controls whether or not the given `<task>` runs
> +	during a `git maintenance run --schedule=<frequency>` command. The
> +	value must be one of "hourly", "daily", or "weekly".
> +
>  maintenance.commit-graph.auto::
>  	This integer config option controls how often the `commit-graph` task
>  	should be run as part of `git maintenance run --auto`. If zero, then
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index b44efb05a3..3af5907b01 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -107,7 +107,18 @@ OPTIONS
>  	only if certain thresholds are met. For example, the `gc` task
>  	runs when the number of loose objects exceeds the number stored
>  	in the `gc.auto` config setting, or when the number of pack-files
> -	exceeds the `gc.autoPackLimit` config setting.
> +	exceeds the `gc.autoPackLimit` config setting. Not compatible with
> +	the `--schedule` option.
> +
> +--schedule::
> +	When combined with the `run` subcommand, run maintenance tasks
> +	only if certain time conditions are met, as specified by the
> +	`maintenance.<task>.schedule` config value for each `<task>`.
> +	This config value specifies a number of seconds since the last
> +	time that task ran, according to the `maintenance.<task>.lastRun`
> +	config value. The tasks that are tested are those provided by
> +	the `--task=<task>` option(s) or those with
> +	`maintenance.<task>.enabled` set to true.
>  
>  --quiet::
>  	Do not report progress or other information over `stderr`.
> diff --git a/builtin/gc.c b/builtin/gc.c
> index f8459df04c..85a3370692 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -704,14 +704,51 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> -static const char * const builtin_maintenance_run_usage[] = {
> -	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
> +static const char *const builtin_maintenance_run_usage[] = {
> +	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
>  	NULL
>  };
>  
> +enum schedule_priority {
> +	SCHEDULE_NONE = 0,
> +	SCHEDULE_WEEKLY = 1,
> +	SCHEDULE_DAILY = 2,
> +	SCHEDULE_HOURLY = 3,
> +};
> +
> +static enum schedule_priority parse_schedule(const char *value)
> +{
> +	if (!value)
> +		return SCHEDULE_NONE;
> +	if (!strcasecmp(value, "hourly"))
> +		return SCHEDULE_HOURLY;
> +	if (!strcasecmp(value, "daily"))
> +		return SCHEDULE_DAILY;
> +	if (!strcasecmp(value, "weekly"))
> +		return SCHEDULE_WEEKLY;
> +	return SCHEDULE_NONE;
> +}
> +
> +static int maintenance_opt_schedule(const struct option *opt, const char *arg,
> +				    int unset)
> +{
> +	enum schedule_priority *priority = opt->value;
> +
> +	if (unset)
> +		die(_("--no-schedule is not allowed"));
> +
> +	*priority = parse_schedule(arg);
> +
> +	if (!*priority)
> +		die(_("unrecognized --schedule argument '%s'"), arg);
> +
> +	return 0;
> +}
> +
>  struct maintenance_run_opts {
>  	int auto_flag;
>  	int quiet;
> +	enum schedule_priority schedule;
>  };
>  
>  /* Remember to update object flag allocation in object.h */
> @@ -1159,6 +1196,8 @@ struct maintenance_task {
>  	maintenance_auto_fn *auto_condition;
>  	unsigned enabled:1;
>  
> +	enum schedule_priority schedule;
> +
>  	/* -1 if not selected. */
>  	int selected_order;
>  };
> @@ -1250,8 +1289,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>  			continue;
>  
>  		if (opts->auto_flag &&
> -		    (!tasks[i].auto_condition ||
> -		     !tasks[i].auto_condition()))
> +		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
> +			continue;

This line only adds unnecessary noise to this patch.

Derrick Stolee Sept. 9, 2020, 12:14 p.m. UTC | #2
On 9/8/2020 9:07 AM, Đoàn Trần Công Danh wrote:
> On 2020-09-04 15:42:01+0000, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> A user may want to run certain maintenance tasks based on frequency, not
>> conditions given in the repository. For example, the user may want to
> 
> Hm, sorry but I couldn't decipher "not conditions" here. :|

Awkward, yes. I intended to contrast frequency-based maintenance with
threshold-based maintenance (git gc --auto).

>> perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
> 
> I think it's better to say: "To assist those users", at least it's
> easier to read for non-native English speakers like me.

Thanks.

>> update the 'git maintenance run' command to include a
>> '--schedule=<frequency>' option. The allowed frequencies are 'hourly',
> 
> So, we have "--schedule=" here, ...
> 
>> 'daily', and 'weekly'. These values are also allowed in a new config
>> value 'maintenance.<task>.schedule'.
>>
>> The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
> 
> and here, ...
> 
>> config value for each enabled task to see if the configured frequency is
>> at least as frequent as the frequency from the '--schedule' argument. We
>> use the following order, for full clarity:
>>
>> 	'hourly' > 'daily' > 'weekly'
>>
>> Use new 'enum schedule_priority' to track these values numerically.
>>
>> The following cron table would run the scheduled tasks with the correct
>> frequencies:
>>
>>   0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
>>   0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
>>   0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly
> 
> but it's spelt "--scheduled=" here and below; a misspelling, I guess.
>
> Reading the patch, it looks like "--scheduled=" is misspelt.

Yes, a previous version used "scheduled" and I didn't fix it here.

>> @@ -1250,8 +1289,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
>>  			continue;
>>  
>>  		if (opts->auto_flag &&
>> -		    (!tasks[i].auto_condition ||
>> -		     !tasks[i].auto_condition()))
>> +		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
>> +			continue;
> 
> This line only adds unnecessary noise to this patch.

Thanks,
-Stolee
Patch

diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 06db758172..70585564fa 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -10,6 +10,11 @@  maintenance.<task>.enabled::
 	`--task` option exists. By default, only `maintenance.gc.enabled`
 	is true.
 
+maintenance.<task>.schedule::
+	This config option controls whether or not the given `<task>` runs
+	during a `git maintenance run --schedule=<frequency>` command. The
+	value must be one of "hourly", "daily", or "weekly".
+
 maintenance.commit-graph.auto::
 	This integer config option controls how often the `commit-graph` task
 	should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index b44efb05a3..3af5907b01 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -107,7 +107,18 @@  OPTIONS
 	only if certain thresholds are met. For example, the `gc` task
 	runs when the number of loose objects exceeds the number stored
 	in the `gc.auto` config setting, or when the number of pack-files
-	exceeds the `gc.autoPackLimit` config setting.
+	exceeds the `gc.autoPackLimit` config setting. Not compatible with
+	the `--schedule` option.
+
+--schedule::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain time conditions are met, as specified by the
+	`maintenance.<task>.schedule` config value for each `<task>`.
+	This config value specifies a number of seconds since the last
+	time that task ran, according to the `maintenance.<task>.lastRun`
+	config value. The tasks that are tested are those provided by
+	the `--task=<task>` option(s) or those with
+	`maintenance.<task>.enabled` set to true.
 
 --quiet::
 	Do not report progress or other information over `stderr`.
diff --git a/builtin/gc.c b/builtin/gc.c
index f8459df04c..85a3370692 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -704,14 +704,51 @@  int cmd_gc(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static const char * const builtin_maintenance_run_usage[] = {
-	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
+static const char *const builtin_maintenance_run_usage[] = {
+	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>] [--schedule]"),
 	NULL
 };
 
+enum schedule_priority {
+	SCHEDULE_NONE = 0,
+	SCHEDULE_WEEKLY = 1,
+	SCHEDULE_DAILY = 2,
+	SCHEDULE_HOURLY = 3,
+};
+
+static enum schedule_priority parse_schedule(const char *value)
+{
+	if (!value)
+		return SCHEDULE_NONE;
+	if (!strcasecmp(value, "hourly"))
+		return SCHEDULE_HOURLY;
+	if (!strcasecmp(value, "daily"))
+		return SCHEDULE_DAILY;
+	if (!strcasecmp(value, "weekly"))
+		return SCHEDULE_WEEKLY;
+	return SCHEDULE_NONE;
+}
+
+static int maintenance_opt_schedule(const struct option *opt, const char *arg,
+				    int unset)
+{
+	enum schedule_priority *priority = opt->value;
+
+	if (unset)
+		die(_("--no-schedule is not allowed"));
+
+	*priority = parse_schedule(arg);
+
+	if (!*priority)
+		die(_("unrecognized --schedule argument '%s'"), arg);
+
+	return 0;
+}
+
 struct maintenance_run_opts {
 	int auto_flag;
 	int quiet;
+	enum schedule_priority schedule;
 };
 
 /* Remember to update object flag allocation in object.h */
@@ -1159,6 +1196,8 @@  struct maintenance_task {
 	maintenance_auto_fn *auto_condition;
 	unsigned enabled:1;
 
+	enum schedule_priority schedule;
+
 	/* -1 if not selected. */
 	int selected_order;
 };
@@ -1250,8 +1289,10 @@  static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 			continue;
 
 		if (opts->auto_flag &&
-		    (!tasks[i].auto_condition ||
-		     !tasks[i].auto_condition()))
+		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
+			continue;
+
+		if (opts->schedule && tasks[i].schedule < opts->schedule)
 			continue;
 
 		trace2_region_enter("maintenance", tasks[i].name, r);
@@ -1274,13 +1315,23 @@  static void initialize_task_config(void)
 
 	for (i = 0; i < TASK__COUNT; i++) {
 		int config_value;
+		char *config_str;
 
-		strbuf_setlen(&config_name, 0);
+		strbuf_reset(&config_name);
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 
 		if (!git_config_get_bool(config_name.buf, &config_value))
 			tasks[i].enabled = config_value;
+
+		strbuf_reset(&config_name);
+		strbuf_addf(&config_name, "maintenance.%s.schedule",
+			    tasks[i].name);
+
+		if (!git_config_get_string(config_name.buf, &config_str)) {
+			tasks[i].schedule = parse_schedule(config_str);
+			free(config_str);
+		}
 	}
 
 	strbuf_release(&config_name);
@@ -1324,6 +1375,9 @@  static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
+			     N_("run tasks based on frequency"),
+			     maintenance_opt_schedule),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
 		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
@@ -1344,6 +1398,9 @@  static int maintenance_run(int argc, const char **argv, const char *prefix)
 			     builtin_maintenance_run_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (opts.auto_flag && opts.schedule)
+		die(_("use at most one of --auto and --schedule=<frequency>"));
+
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index e0ba19e1ff..328bbaa830 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -264,4 +264,44 @@  test_expect_success 'maintenance.incremental-repack.auto' '
 	done
 '
 
+test_expect_success '--auto and --schedule incompatible' '
+	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
+	test_i18ngrep "at most one" err
+'
+
+test_expect_success 'invalid --schedule value' '
+	test_must_fail git maintenance run --schedule=annually 2>err &&
+	test_i18ngrep "unrecognized --schedule" err
+'
+
+test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
+	git config maintenance.loose-objects.enabled true &&
+	git config maintenance.loose-objects.schedule hourly &&
+	git config maintenance.commit-graph.enabled true &&
+	git config maintenance.commit-graph.schedule daily &&
+	git config maintenance.incremental-repack.enabled true &&
+	git config maintenance.incremental-repack.schedule weekly &&
+
+	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
+		git maintenance run --schedule=hourly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <hourly.txt &&
+	test_subcommand ! git commit-graph write --split --reachable \
+		--no-progress <hourly.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
+		git maintenance run --schedule=daily 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <daily.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <daily.txt &&
+	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+
+	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
+		git maintenance run --schedule=weekly 2>/dev/null &&
+	test_subcommand git prune-packed --quiet <weekly.txt &&
+	test_subcommand git commit-graph write --split --reachable \
+		--no-progress <weekly.txt &&
+	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+'
+
 test_done